ABSTRACT



This study examines the error rate of the University of California Union
Catalog Supplement, a computer-produced book catalog of the materials
cataloged by the nine University of California campuses during the years
1963-1967. The catalog, published in 1972, consists of 47 volumes of
approximately 850 pages each and is divided into a 31-volume Author/Title
section and a 16-volume Subject section. This study attempts not only to
determine the rate and nature of error in this particular catalog, but also
to provide a general methodology for studying error rates in large
bibliographic files, whether computer produced or not.

A stratified random sample of 94 pages (5,900 entries) was taken from 
the catalog. The pages were thoroughly examined by two of the authors and 
each error discovered was analyzed according to six aspects: type, location, 
effect, cause, language, and non-monographic type. A total of 4,338 errors 
were found in the sample. This represents an error rate of 46.1 errors per
page or 0.74 errors per entry. The sample from the Author/Title section
of the catalog (where main entries consist of a complete bibliographic record)
contained 3,167 errors, or 0.88 errors per entry. The sample from the
Subject catalog (where entries consist only of subject heading, author, short
title, date, location code, and call number) contained 1,171 errors, or 0.51
errors per entry. Errors were categorized according to the degree of 
seriousness of their effects: minor, serious, and fatal. Minor errors 
made up approximately one half of all those found. Serious errors made up 
about 43% and fatal errors totaled approximately 7% of the error found in 
the sample. 
I. INTRODUCTION



A. BACKGROUND

The University of California Union Catalog Supplement (hereafter re- 
ferred to as UCUCS) is a computer-produced book catalog of the monographic 
materials cataloged by the nine University of California campuses during 
the period 1963-1967. The catalog was intended to serve not only as a 
finding tool but also as a complete bibliographic record of the items
cataloged during that five-year period. It is divided into a 31-volume
Author/Title catalog and a 16-volume Subject catalog. All subject entries appear
in the Subject catalog, and all other added entries and main entries appear 
in the Author/Title catalog. Added entries in the Author/Title section are 
in brief form and refer to the main entries which are in full bibliographic 
detail. 

The first step involved in producing the catalog was the collection of
the main entry catalog cards for monographs cataloged during 1963-1967 from
eight of the nine U.C. campuses. (One of the campuses, Santa Cruz, sent
its records in machine-readable form on magnetic tape instead of in catalog
card form.) The collection of cards was performed by the Institute of Library 
Research (ILR) over a period of years. Over 1.1 million catalog records were 
collected and processed, constituting approximately 750,000 unique titles. The 
collected records were inspected before processing in order to remove those 
containing non-Roman alphabet characters. The remaining records were given 
a unique serial number (including a campus code) and then microfilmed to 
guard against fire or other loss. One of the functions of the unique identi- 
fication number was its subsequent use in verification of the accuracy of 
the keyboarding process. The numbered records were then sent to a commercial
vendor for Optical Character Recognition (OCR) typing. They were not
formatted or edited before they were sent to the keying vendor; the keyboarding
was done from the original catalog records, unaltered except for the
identification number which had been stamped in the upper right corner of each card.
Some rather gross delineation of parts of the catalog record was done by 
the typists in the process of keying the records. Their instructions were 
simply to key all the characters consecutively in order of appearance on 
the card, starting with the identification number, continuing with the call
number and then the text of the record. The major data regions of the record 
were to be delineated by the typists by means of slashes ("/"), cross hash 
marks ("#") and plus signs ("+"). A slash was to be keyed at the start of 
each new paragraph indention on the card (data elements so marked included 
entry heading, start of title statement, collation, notes, tracing, etc.); 
a cross hash mark was to be keyed after the call number; and a plus sign was 
to be added at the end of the record. The typists were not expected to 
possess prior knowledge of the structure of a bibliographic record (other
than whatever familiarity they may have had resulting from their own personal
use of libraries), and were instructed to recognize the data elements purely
on the basis of their locations on the card. (A copy of the instructions for
keying the records may be obtained from ILR.)2
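
The keying convention just described (identification number, call number,
a cross hash mark, slash-introduced data regions, and a closing plus sign)
can be illustrated with a minimal sketch. It is not the ILR software. The
function name, the sample record, and the rule that the identification
number ends at the first space are all assumptions made for illustration,
as is the simplification that no slash occurs inside the data itself.

```python
def parse_keyed_record(raw: str) -> dict:
    """Split one keyed record into its delimited regions.

    Hypothetical layout: "<id> <call number># /region /region ... +"
    """
    text = raw.strip()
    if not text.endswith("+"):
        raise ValueError("record is missing the end-of-record plus sign")
    text = text[:-1]                      # drop the closing "+"

    head, sep, body = text.partition("#")
    if not sep:
        raise ValueError("record is missing the cross hash after the call number")

    # Assumption: the identification number is the first space-delimited token.
    ident, _, call_number = head.strip().partition(" ")

    # Each "/" marks the start of a new paragraph indention on the card.
    regions = [part.strip() for part in body.split("/") if part.strip()]
    return {"id": ident, "call_number": call_number.strip(), "regions": regions}


# Invented record, for illustration only.
record = parse_keyed_record("B0000001 QA76 .X9# /Doe, Jane. /An example title. /261 p. +")
```

Because the typists located data elements purely by position on the card, a
single misplaced slash shifts every later region, which is why the later
format recognition step was so sensitive to keying accuracy.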
Since the UCUCS catalog was to include records of interest to many
libraries in the academic community, it was important that the records be in 
MARC II communications format. An Automatic Format Recognition (AFR) program 
was developed by ILR to tag the bibliographic data fields of the records 
according to a subset MARC II format after they were keyed and returned to ILR 
in pre-AFR magnetic tape form. (For a description of the AFR program, see
"Automatic Format Recognition," by Liz Gibson.3 For a comparison of U.C.'s
AFR program with other methods, see Brett Butler's article, "Automatic Format
Recognition of MARC Bibliographic Elements: A Review and Projection."4) These
programs were being written at the same time that the Format Recognition programs 
were being developed by LC. There were no programs in operation at that time 
that could have been used by U.C. instead of developing their own programs. 
The AFR program depended on the accuracy of the insertion of the slash
and cross hash delimiters by the typists, as well as the accuracy of the 
typing itself. The AFR program relied not only on the data element deli- 
miters but also on the content of the data fields. The keyboarding stage 
of the process was, therefore, very important for the success of the AFR 
program, and it was upon this program that subsequent computer manipula- 
tions of the records depended (sorting, merging, consolidation, etc.). 

Although keyboarding accuracy was recognized as being very important,
there was no key verification stage in the processing of UCUCS records. 
However, the keying vendor's contract stipulated a keying error rate not 
to exceed .5% (.005) of the keystrokes. According to the contract with 
the vendor, if this error rate were to be exceeded for any batch of records, 
either the vendor would rekey the batches which showed an excessive amount 
of error or else receive a reduced payment for the erroneous batch. The 
agreed upon method of determining the number of errors in a record and 
the error rate in batches of records was as follows. It was estimated 
that each record contains an average of 400 characters, excluding the 
delimiter characters to be added by the keyboarders. In determining the 
number of errors in a record, the text was divided into five-character 
strings. Each group of five characters was examined as a unit. If no 
error occurred in a five-character unit, then that unit was considered 
error free; if any error occurred in a five-character unit — regardless of 
what sort of error occurred, or how many actual keystrokes were involved — then 
that unit was considered to contain one error. Each record contained an 
average of about 80 five-character units. Five such records would be made 
up of about 400 units. The .5% error rate tolerance therefore meant that 
ILR would tolerate an average of two errors in 400 units or five records. 
In determining the vendor's error rate, ILR operations staff members took 
3% random samples of records in each batch of 1,000 keyed records. These 
were printed out from the magnetic tape and compared with images of the 
original source records stored on microfilm. Errors were counted as described 
above. Batches which were determined by this sampling inspection to have a 
keying error rate in excess of .5% were to be returned to the vendor to 
be rekeyed.
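
The contractual counting rule can be sketched as follows. This is a minimal
illustration, not the procedure ILR staff actually carried out by eye: the
function names are hypothetical, and the sketch assumes the keyed text and
the source text can be aligned position by position, which an inserted or
dropped character would upset in practice.

```python
def count_error_units(source: str, keyed: str, unit: int = 5) -> tuple[int, int]:
    """Return (units containing any error, total units) for one record."""
    length = max(len(source), len(keyed))
    # Pad the shorter text so a length mismatch shows up as an error.
    src = source.ljust(length, "\0")
    key = keyed.ljust(length, "\0")
    starts = range(0, length, unit)
    # A unit counts once, however many keystrokes inside it are wrong.
    errors = sum(1 for i in starts if src[i:i + unit] != key[i:i + unit])
    return errors, len(starts)


def exceeds_tolerance(errors: int, units: int, per_thousand: int = 5) -> bool:
    """True when the error rate is above 0.5% (5 per 1,000 units)."""
    return errors * 1000 > units * per_thousand


# Two wrong keystrokes in the same five-character unit count as one error.
same_unit = count_error_units("ABCDEFGHIJ", "ABXXEFGHIJ")
# One wrong keystroke in each of two units counts as two errors.
two_units = count_error_units("ABCDEFGHIJ", "AXCDEFGXIJ")
```

With 80 units per record, five records give 400 units, so two erroneous
units (2/400 = 0.5%) sit exactly at the tolerance and a third exceeds it.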

However, the rekeying compounded the quality control problem still 
further. Subsequent sampling showed that the error rates of many of the
rekeyed batches were even higher than the error rates prior to rekeying. 
Moreover, shipping the boxes of cards back and forth to the vendor (located 
in Ohio) took a great deal of time. 
Due to time delays and the fact that the error rates of the recycled
batches of records were as bad as or worse than those of the original 
keying attempts, the practice of returning to the vendor batches containing 
excessive error was soon discontinued. Instead, ILR accepted whatever 
was produced by the vendor and reduced payment proportional to the error 
rate in accordance with the stipulations of the contract. This choice was
dictated almost entirely by time constraints on UCUCS production. If the
contracted date when the data base was to be delivered to the photo-compo- 
sition vendor could have been extended, it might have been possible to 
absorb the time delays created by shipping defective batches of records back 
and forth to the keying vendor. Since this was not possible, the actual 
keying error rate for the full UCUCS data base was bound to be in excess of 
.5%. The actual keying error rate was finally estimated to be an average
of 0.54%, according to the operational definition of "error rate" agreed 
upon by ILR and the keying vendor. 

The effort to control the quality of the keyboarding was one of several
planned methods of quality control which were either only partially success- 
ful or else never implemented at all because of various influencing factors. 
It was never intended by the Project Manager that the data base be
exhaustively manually proofread or edited. To do so would have been impossible given
the then-prescribed time and monetary constraints of the UCUCS project. 
Moreover, a major purpose of the project, according to the Project Director, 
was to experiment with producing a book catalog with a minimum of manual 
intervention and with an error level that was supposedly agreed in advance 
as one of the product specifications. As much of the catalog production as 
possible was to be performed by the computer processes with only limited 
human inspection, including quality control. Three other major quality 
control efforts were originally planned, all to be executed mechanically with 
a minimum of human intervention. Only one of these was actually utilized, 
however, and it was used on only part of the data base.

It was intended that the Harvard Shelflist, available in machine-readable 
form, be used to derive a baseline name authority list of about 260,000 
entries against which all author names in the UCUCS data base would be 
matched. Exact matches would be assumed to be spelled correctly. Author 
names which did not find a match in the authority list would be printed out 
for a manual check. This plan was never implemented because the software 
was incomplete by the time photocomposition began. 

It was also intended that all English language words in title fields
would be checked against an authority list. The authority list in this 
case was the shorter Oxford English Dictionary, available in magnetic tape 
form, with a total of about 75,000 words. A program to perform this 
matching operation was written by a Research Assistant who was a doctoral 
student in the Electrical Engineering and Computer Sciences Department at 
Berkeley. This program operated on the assumption that exact matches with 
the authority list were spelled correctly and that non-matches were mis- 
spelled. Words thus identified as misspelled would be printed out. In 
addition, the computer would attempt to predict the correct spelling and
this too would be printed out. If visual inspection of the printout showed 
that no change was necessary, the proofreader would make no textual changes 
and merely submit a control card. After the next input of the data the 
computer would automatically accept the word it had initially identified 
as being misspelled. If the proofreader determined a different spelling of 
the word to be correct, this spelling would be noted by the proofreader and
then keypunched and fed into the computer to override the "automatic" 
correction. In effect, the computer program attempted to automatically 
correct the entries, subject to manual override and control. The program to 
perform this operation was written, and much of the debugging had been done. 
Unfortunately, the program was never used, due primarily to time constraints. 
Also, because the program was so complex and sophisticated, it was not 
certain that there was sufficient money available to pay for the computer 
time which would be necessary if it had been used in production. 
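
The proofing cycle described in the two preceding paragraphs (exact match
against an authority list, a predicted correction for each non-match, and
manual override by control card) can be sketched in a few lines. The
original program's matching method is not described in this report, so the
sketch substitutes Python's standard difflib similarity matching as a
stand-in. The function name, the tiny word list, and the representation of
overrides as a dictionary are assumptions for illustration only.

```python
import difflib


def proof_title_words(words, authority, overrides=None):
    """Return the accepted form of each title word.

    overrides maps a lower-cased word to the spelling chosen by the
    proofreader; mapping a word to itself accepts it unchanged.
    """
    overrides = overrides or {}
    known = sorted({w.lower() for w in authority})
    accepted = []
    for word in words:
        key = word.lower()
        if key in known:
            accepted.append(word)            # exact match: assumed correct
        elif key in overrides:
            accepted.append(overrides[key])  # proofreader's decision wins
        else:
            # Stand-in for the "predicted" spelling of the original program.
            guess = difflib.get_close_matches(key, known, n=1)
            accepted.append(guess[0] if guess else word)
    return accepted


authority = ["history", "of", "california", "library"]
automatic = proof_title_words(["histroy", "of", "califrnia"], authority)
overridden = proof_title_words(["histroy"], authority, {"histroy": "histroy"})
```

A design of this kind corrects automatically by default, so its usefulness
depends on how often the predicted spelling is right; proper names and
foreign words in titles would generate many non-matches.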

A third mechanical quality control effort was used in production to 
a limited extent. This involved comparing the subject headings which would 
appear in the Subject catalog with a machine-readable authority list. The 
authority list in this case consisted of the machine file equivalents of 
LC Subject Headings List, 7th edition and the first annual supplement,
plus approximately 50,000 subject headings, verified as legitimate U.C.
headings not included in the U.C. list, plus about 3,000 of the geographic
headings used in the U.C. Berkeley Library main catalog.5 This comprised
an authority list of about 120,000 entries. Similar to the title proofing 
program discussed above, this program would print out non-matches along with 
the near-matches which the program determined to be the correct form of the 
misspelled subject heading. The computer's suggested change would be made 
automatically if the proofreader did not override it and replace it with 
another spelling of that subject heading. In this process a number of 
variant forms of subject headings (such as "...Descr. and Trav." instead 
of "... Description & Travel") were normalized so that more entries would 
be brought together under one authenticated form of the subject heading. 
All of the subject headings in the total UCUCS file were run through the 
machine process. Approximately 40% of the needed changes, amounting to 
1,058,072 changes, were ultimately made in the UCUCS Master File — again, 
limited because of time and monetary constraints. 

All other operations performed to produce UCUCS were done by computer 
rather than manually. After the records were keyed and automatically 
tagged by the AFR program, duplicate entries were automatically consolidated 
into a machine-readable card set equivalent created for each union record, 
including all appropriate subject and added entries. The file was then 
exploded into all those entries which would comprise the Author/Title catalog 
and all those which would make up the Subject catalog. Finally the entries 
in each part of the catalog were sorted and merged into sequence, and the 
two parts of the catalog went through the line-and-column makeup, then page 
composition and printing process. 

In summary, most of the source records went all the way from key- 
boarding to page printing and binding without manual editorial intervention 
at any point in the production cycle. This was a management decision that 
resulted in a high error rate (mostly because it was not possible within time 
and budgetary constraints to implement several of the planned programs and 
procedures) but with a lower unit cost than had been experienced by any other 
equivalent book catalog conversion effort before or since UCUCS. The unit 
cost of approximately $1.16 per record includes such processes as keyboarding 
and optical-scan reading of source records; formatting of source records; 
development and operation of an authority control file improvement system; 
consolidation of duplicate entries; formatting for videocomp composition; 
and composition, printing, and binding of 250 copies of a catalog of 42,000
pages. 



B. REASONS FOR THIS ERROR STUDY 

There are five reasons for the present study. 

i) UCUCS has been the object of much criticism due to the many errors 
in it. Much of the criticism has come from U.C. librarians who 
would like to rely on such a tool in their daily work. This criticism 
demonstrated the need for an objective measurement of the errors in 
the catalog. A goal of the present study was to provide a 
definitive statement of the nature and extent of the error in
UCUCS. Consequently, both the rate and types of errors have been
carefully analyzed in this study.

ii) A second reason for studying the errors in UCUCS was to improve
the present machine-readable file from which the book form UCUCS
was made. The file needs to be cleaned up for use by the U.C.
Bibcenter for on-demand catalog card production and other applica- 
tions that will use the UCUCS machine records. Moreover, the UCUCS 
file can serve as a valuable resource data base for subsequent use 
by libraries outside the University of California system. The 
fewer the errors in the file, the greater its value. 

iii) The programs used to produce UCUCS also need improvement for
use by the University-wide Library Automation Program for subsequent
bibliographic processing efforts. A careful, definitive
study of the errors in UCUCS can facilitate such program improvement.

iv) A fourth purpose of the study was to identify needed changes
in local U.C. library procedures. Some of the error in the
catalog is due to failures in preparation of the records prior
to handling by the ILR staff. Some of the error is also due
to variant cataloging practice among the nine U.C. campuses.
This study is intended to help delineate areas where consideration
could be given to improvement in local procedures.

v) A fifth important motivation for a study of this sort was a
need for a general methodology for measuring errors in bibliographic
catalogs — machine produced or otherwise — and in machine-readable 
bibliographic files. As far as we have been able to determine, 
no such generalized methodology has yet been designed. The 
present study analyzes both rates and types of errors in a
particular catalog; in the process of doing this, a taxonomy of
error types has been developed which should be applicable and
useful in the error analysis of any bibliographic catalog or
similar file. 



C. RELEVANT LITERATURE

In searching library literature, the authors sought information
regarding studies of errors in computer-produced book catalogs or other
large catalogs. Although there are a few studies which consider the problem
of error in such catalogs, none of them approaches the complexity of
analysis attempted here.


The following literature was searched: Library Literature, 1967-June
1974; Annual Review of Information Science and Technology, 1966-1973; and
ERIC, 1966-December 1974. The literature examined falls principally into
three groups: 

Design-oriented or descriptive discussion 

Discussion of filing rules for machines and the problems caused 
thereby 

Evaluation of error in existing machine-readable printed biblio- 
graphic catalogs. 

Much has been written regarding the design and implementation of 
machine conversion of files to book catalog format. Most of the literature 
seen either was of this type or described projects in process. The 
latter descriptions almost never included evaluations of error in the project 
discussed. 

Some of the discussion about filing problems caused by computer filing 
is of value, since it helps explain the type and effect of a large number of 
the errors analyzed in this study. Cartwright's paper, "Mechanization and 
Library Filing Rules," discusses some of the filing problems which can occur 
in computer-produced book catalogs.6 He suggests that filing problems should
be considered at the time of file conversion, not after-the-fact. He also 
observes (correctly, in the authors' opinion) that standardized spacing and 
punctuation are essential, and that the lack of such standardization can lead 
to serious problems in filing and in the consolidation of entries. 

Cartwright describes two book catalogs produced by computer: the book 
catalog at Florida Atlantic University and the Meyer Undergraduate Catalog 
at Stanford University. These systems had difficulty in filing entries 
in their proper order; for example, honorary titles such as "Sir" were 
used as filing elements, in contrast with the A.L.A. Filing Rules, which 
ignore titles in filing. No measurement of error is provided in this 
report. 

A paper by Joe E. Hewitt is one of the few which actually reports 
and evaluates an error rate.7 Hewitt points out the importance of per- 
forming catalog error studies not only for many uses by the local library 
staff but also for publication and use by other libraries in evaluating 
their own catalogs. He, like the present authors, decries the scarcity 
of reports of catalog error studies in library literature: "... the un- 
fortunate result is that the research library performing an audit does 
so in a vacuum." His paper breaks this vacuum's seal by publishing a 
filing error rate of 1.1 percent in the University of Colorado's Norlin 
Library author/title card catalog of approximately 1,000,000 cards.
Hewitt succinctly describes the general methodology used in arriving at 
this error rate and reports other statistical data. He also cites a 1953 
audit of the official catalog of the Library of Congress reporting a filing 
error rate "in excess of 5 percent." 

The third group of literature includes evaluations of errors in
computer-produced catalogs. Some of these studies are solely concerned
with cost-benefit analyses. Other studies evaluate the impact of a book
catalog on the user, but error in such products is ignored, or else
no error exists.
Books in Print, 1969, a computer-produced catalog, has been analyzed
for error in an article by N. Cambier, et al., "Books in Print 1969: An
Analysis of Errors."8 A sample of 2,000 entries was chosen; the BIP entries
were then compared with Publishers' Trade List Annual to determine any
discrepancies. After excluding factors such as foreign names and variant
spellings, errors were typed as follows: Author omission, author error, 
title omission, title error, date omission, date error, price omission, 
defined; however, an error rate of 8,G percent was reported. Although 
this study has a relatively simple methodology, it is nevertheless the 
study most comparable to the UCUCS error study. 

Another study of some relevance to UCUCS is the one made by J.L.
Dolby, et al., "Efficient Automatic Error Detection in Processing
Bibliographic Records."9 This report discusses an "error study" undertaken to
determine the priority to be given procedures involving automatic error
detection programs. Samples of unstated size and composition were drawn
from Harvard University's Widener Shelf List and from Stanford University's
Meyer Undergraduate Catalog. The samples from both libraries were
proofread and corrections indicated on a computer output sheet. These
sheets provided the data for the study. 

No statement was made about the actual rate of error found. The 
analysis was aimed primarily at comparing the types of errors prevalent in 
the two samples and considering the possibility of automatic error detection. 
Errors were categorized into three groups: 

Sequence errors 

Missing information

This group has six subdivisions, including: spaces, 
punctuation, capitalization, and diacritics. 

Incorrect information 

This group has nine subdivisions, including: proper names,
words, punctuation, format, capitalization, and diacritics.

As can be seen, these categories are considerably broader than those used in 
the UCUCS study. Correlations were not made between type of error and cause
of error, nor was an attempt made to determine the location of error 
within the entry. 
II. SUMMARY



A. METHODOLOGY 

A stratified random sample of 94 pages from UCUCS was used in this study 
— 61 pages from the Author/Title section of the catalog and 33 from the 
Subject section. Each of these pages was Xeroxed and read by one of the 
authors who marked the copied page with red pencil and recorded each error 
found on an error coding sheet. Each page was re-examined by another one of 
the authors in order to catch any errors missed by the first person and also 
to insure continuous standardization of interpretation of the error codes. 
For each sample page, the two adjacent pages were also Xeroxed. These ad- 
jacent pages were not included in the sample in any way but were used to help 
find errors actually occurring on the sample page which would be impossible
to notice without seeing the adjacent pages. 

The analysts used an inclusive definition of error. Errors were considered
to include not only problems which caused loss of entry point or misfiling
of entries, but also anything which might cause confusion or irritation on
the part of the catalog user. Therefore, relatively minor mistakes such as
improper spacing or print size were included as errors in this study. The
"catalog user" was considered to include not only professional librarians
but also students and the general public.

Although many kinds of errors were identified in the study, they were coded 
in such a way that they could be grouped later into three general categories 
of "fatal," "serious," and "minor;" thus relatively insignificant errors could 
be evaluated separately from more serious ones. Fatal errors were defined as 
those which would make it very likely that an entry point (i.e., a biblio- 
graphic record) would be lost to the user. Serious errors included non-fatal 
errors which would make it fairly likely that an entry point might be missed 
by the user and errors which render the content of the record unclear or 
misleading. Minor errors include those which merely affect the appearance
of the entry without being likely to cause confusion for the user. 

Each error found was coded according to six different aspects: Type, 
Location, Effect, Cause, Language, and Non-Monographic Type. Using each of 
these aspects in recording the errors made it possible to obtain a rather 
realistic and specific idea of the nature of the errors in UCUCS, and this, 
in turn, will enhance the efforts of programmers and systems analysts to 
improve the data base and the programs which produced the catalog. 

After all of the errors had been coded, the collected data was key- 
punched, and most of the data reduction was done by computer. 
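
One way to picture the coded data is as one record per error carrying the
six aspects, with tallies taken over any single aspect. The sketch below is
only an illustration of that idea. The class name, field names, and the
code values shown are hypothetical and are not taken from the study's
coding sheet, and the page and entry fields are added here for convenience.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedError:
    page: int                  # sample page on which the error occurred
    entry: int                 # entry number within the page
    type: str                  # the six coded aspects follow
    location: str
    effect: str
    cause: str
    language: str
    non_monographic_type: str


def tally(errors, aspect):
    """Count errors by the value of one coded aspect."""
    return Counter(getattr(e, aspect) for e in errors)


# Invented code values, for illustration only.
coded = [
    CodedError(12, 3, "spacing", "collation", "minor", "keying", "English", "none"),
    CodedError(12, 7, "misspelling", "title", "serious", "keying", "German", "none"),
    CodedError(40, 1, "misfiled", "heading", "fatal", "program", "English", "serial"),
]
by_effect = tally(coded, "effect")
by_cause = tally(coded, "cause")
```

Keeping every aspect on every record is what allows later cross-tabulation,
for example of type of error against cause.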

B. RESULTS 

A total of 4,338 errors were found on the 94 sample pages of 5,900 
entries (3,589 entries in the Author/Title section and 2,311 in the Subject 
section). This represents an average of 46.1 errors per page, or 0.74
errors per entry. There were 3,167 errors in the Author/Title section and 
1,171 errors in the Subject section. We can estimate that there are an 
average of 0.88 errors per entry in the Author/Title section and 0.51 errors
per entry in the Subject section. Generally, however, errors tend to 
"clump" in some entries rather than being evenly spread throughout the 
entries. Thus, many entries showed no errors. We can also estimate that 
there are approximately 51.9 errors per page in the Author/Title catalog
and 35.5 errors per page in the Subject section. 

If these figures seem a bit alarming, it must be remembered that they 
include all sorts of errors, even very minor ones. The serious and fatal 
errors together represent only about half of all the errors. The serious 
errors totaled 1,886 representing 43.5% of all found, and the fatal errors 
totaled 300, representing 6.9%. There were 2,152 minor errors in the sample, 
or about 50% of all the errors found. 
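
The rates and percentages reported in this section follow directly from the
sample counts, and the arithmetic can be checked with a short sketch. The
counts are those given above; the variable and function names are
illustrative only.

```python
# Counts reported for the 94-page sample.
SAMPLE = {
    "Author/Title": {"pages": 61, "entries": 3589, "errors": 3167},
    "Subject": {"pages": 33, "entries": 2311, "errors": 1171},
}
SEVERITY = {"minor": 2152, "serious": 1886, "fatal": 300}


def rates(pages, entries, errors):
    """Return (errors per page, errors per entry), rounded as in the text."""
    return round(errors / pages, 1), round(errors / entries, 2)


# Sum the two sections to obtain the whole-sample counts.
total = {key: sum(section[key] for section in SAMPLE.values())
         for key in ("pages", "entries", "errors")}

overall = rates(**total)
by_section = {name: rates(**counts) for name, counts in SAMPLE.items()}
shares = {name: round(100 * n / total["errors"], 1) for name, n in SEVERITY.items()}
```

The three severity counts sum to the 4,338 errors found, and the minor
share works out to 49.6%, which the text rounds to about 50%.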
III. METHODOLOGY



A. THE SAMPLE 

Using the Rand Corporation Tables of Random Numbers, a stratified sample 
of 61 pages from the Author/Title catalog and 33 pages from the Subject catalog
was selected. The Author/Title catalog consists of about 26,900 pages in 
31 volumes, and the Subject catalog consists of about 13,900 pages in 16 
volumes. The sample selected represents an average of about two pages for 
each volume in the catalog, or approximately .23% of the entire sampling universe. 
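
A stratified page sample of this shape can be sketched as follows. The
study drew its pages from printed random number tables; the sketch
substitutes Python's pseudo-random generator, and it assumes for simplicity
that pages are numbered consecutively within each section rather than
within each volume. The names used are illustrative.

```python
import random

# (approximate pages in stratum, pages to sample), as reported in the text.
STRATA = {"Author/Title": (26_900, 61), "Subject": (13_900, 33)}


def draw_sample(strata, seed=None):
    """Draw pages without replacement, independently within each stratum."""
    rng = random.Random(seed)
    return {
        name: sorted(rng.sample(range(1, pages + 1), k))
        for name, (pages, k) in strata.items()
    }


sample = draw_sample(STRATA, seed=1)

universe = sum(pages for pages, _ in STRATA.values())   # about 40,800 pages
sampled = sum(k for _, k in STRATA.values())            # 94 pages
sampling_percent = round(100 * sampled / universe, 2)   # about .23%
```

Sampling each section separately guarantees that both parts of the catalog
are represented, which a simple random sample of 94 pages would not.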

Each of these 94 pages (a list of the sample pages is given in Appendix A) 
was Xeroxed, along with the page immediately preceding and the page
immediately following it. These two adjacent pages were not used as part of the
sample but merely as aids in determining the errors actually occurring on 
the sample pages. In some cases an error would not be noticeable on an 
isolated page without referring to one of the adjacent pages. But in no 
case were errors appearing on the pages adjacent to the sample pages tallied
in our study; only errors actually occurring on one of the 94 sample pages 
were analyzed and counted. 

B. METHOD OF DETERMINING AN ERROR 

An error may be an error may be an error, but what constitutes an error
is still a matter of opinion. We have no doubt that what we have decided to 
include as an error may be considered by others as too trivial to be included 
or even not an error at all, and that we may have excluded instances of what 
others would term an error. This certainty arises from the experience of 
disagreement between the authors on the question of what should count as error,
not to mention the disagreement within our respective minds at different times.
The guiding principle which evolved through our discussion was to be as inclusive 
as possible within the realms of reason: to tally as an error everything from 
that which might cause even mild confusion or irritation on the part of the 
catalog user to those errors which are very serious and almost certainly result 
in a lost entry point. (Of course, we did not hold the Catalog responsible for 
those factors inherent in any bibliographic catalog which might be confusing 
to the user, such as arbitrary but typical nuances in filing rules.) "The 
user" was considered to include not only professional librarians but also students 
and the general public who might use the catalog. We included as error 
relatively minor typographical errors such as misplaced umlauts or accent 
marks, relatively minor spacing problems which might result in confusion or 
just difficult reading, such as the absence of a space in appropriate places 
in the collation statement (e.g., 261p. instead of 261 p.), and apparent 
inconsistencies in printing format which make the text difficult to read. 
Anticipating the disagreement with this inclusive policy, we have attempted 
to record our data and report our results as specifically as possible so 
that readers can more readily evaluate our conclusions in light of their 
own opinions and definitions of "error." 

It should be noted that this inclusive policy of error determination 
will tend to make the overall error rate higher than would a less critical 
policy, and that it will also result in a relatively lower percentage of 
"fatal" and "serious" errors compared with the third category, "minor" errors. 
The authors defined fatal errors to be those which would make it very likely 
that an entry point (i.e., bibliographic record) would be lost to the user. 
This category includes those errors which obviously involve a lost entry 
point and also entries which had been misfiled onto a page other than the 
one where they should have appeared (according to ALA filing rules). Serious 
errors are defined to include non-fatal errors which would either make it 
fairly likely that an entry point might be missed by the user (such as an 
entry misfiled on the correct page but in the wrong column) or errors which 
render the content of the record unclear or misleading. Minor errors include 
those which merely affect the appearance of the entry without being likely 
to cause confusion for the user. 
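
The three severity levels can be sketched as a lookup on the Effect codes listed in Figure 1. The sketch below is illustrative only. It maps just those effects whose severity is stated outright in the definitions above; the function name `severity` and the choice to return `None` for every other code are assumptions made for the example, not the data reduction actually used in the study.

```python
from typing import Optional

# Effect codes (Figure 1) whose severity is stated in the text.
FATAL_EFFECTS = {"10", "23", "26"}    # lost entry point; misfiled onto another page
SERIOUS_EFFECTS = {"22", "25", "30"}  # wrong column on the right page; content uncertain
MINOR_EFFECTS = {"80"}                # appearance only


def severity(effect_code: str) -> Optional[str]:
    """Return 'fatal', 'serious' or 'minor', or None when the text gives no rule."""
    if effect_code in FATAL_EFFECTS:
        return "fatal"
    if effect_code in SERIOUS_EFFECTS:
        return "serious"
    if effect_code in MINOR_EFFECTS:
        return "minor"
    return None  # severity of the remaining effects is not spelled out here


if __name__ == "__main__":
    for code in ("10", "25", "80", "51"):
        print(code, severity(code))
```

Leaving unmapped codes as `None` keeps the sketch from asserting a severity that the definitions do not give.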

To arrive at a consensus on what we wanted to include as errors and 
subsequently to ensure consistency in our use of the error categories, a 
duplicate set of 31 of the 94 sample pages (every third page), each along with 
its two adjacent pages, was Xeroxed. Two of the authors (Todd and Sommer) then 
examined a few of the pages independently. Each entry was numbered by starting 
with the first entry in each column as number 1 and proceeding through the 
remaining entries in that column. The errors were first marked in red pencil 
on the sample pages. (See Appendix B for illustration.) They were then 
coded on a data sheet according to our respective understandings of the first 
version of the error categories. The results were compared and discussed; 
revisions were made in the coding sheet, and definitions were clarified. 
More pages were similarly examined and results compared; more discussion 
and revision ensued. This process continued until an error categorization 
very near the present form was worked out. The remainder of the sample was 
then examined by both of these reviewers and errors were recorded. One 
further revision in the error categorization was subsequently made (an 
elaboration in the Effect categories 50 through 7Z, described later). All 
entries coded according to earlier versions were, of course, recorded according 
to the present form of the error categorization. 

Each sample page (and its adjacent pages) was examined at least twice, 
once by each analyst. Of course the pages used in deriving the error cate- 
gorizations were read in part more than twice. It was noticed that in looking 
over these pages more than twice, rarely were errors found that had been 
"missed" in previous examinations. It is therefore the opinion of the authors 
that examining each sample page once by each of two people is a reasonably 
satisfactory method to catch the vast majority of errors. Errors caught in 
subsequent readings will probably be minor and the effort of additional 
proofreading will not be worth the added expense. 

Because the authors were not familiar with all the languages used in 
the catalog, a number of entries were examined by other people with the 
guidance of one of the authors. One of the present authors has an adequate 
reading knowledge of Spanish and Portuguese. Entries in many other languages 
were checked by other ILR staff members and some Library School students. A 
total of 152 entries of our sample remained partially unanalyzed because 
they contained words in languages for which we had little linguistic capability 
(e.g., Turkish, Esperanto, Indonesian). However, even these entries were 
examined for errors which could be noticed without knowing the foreign language 
used. That is, most such entries contain some English words (at least in the 
tracings, notes, etc.), and it was not usually necessary to know the language 
in order to check for non-consolidation problems, extra blanks, etc. 

I. Type 
    10 Duplicate data not suppressed 
    20 Variant form of same data or ed. 
    30 Orthographic inaccuracy 
        31 transposition 
        32 missing string 
        33 added string 
        34 meaningless string 
        35 missing or added blank(s) 
        36 incorrect or missing caps 
        37 other misspelling or undetermined 
    40 String improperly used in filing 
        41 function term 
        42 dates 
        43 associated title 
        44 English article 
        45 non-English article 
        46 other 
    50 Data field missing 
    60 Data field added 
    70 Inappropriate entry 
    80 Other 

II. Location 
    10 Main entry 
        11 entry heading 
        12 title statement 
            1A (short) title 
            1B subtitle 
            1C author statement 
            1D other 
        13 edition statement 
        14 place 
        15 publisher 
        16 date 
        17 collation 
        18 notes 
        19 tracings 
        1X call no./location 
        1Y other 
    20 Added entry 
        21 heading 
        22 title statement 
        23 date 
        24 see reference 
        25 call no./location code 
        26 other 
    30 Subject entry 
        31 subject heading 
        32 entry heading 
        33 title statement 
        34 date 
        35 call no./location code 
        36 other 

III. Effect 
    10 Lost entry point 
    20 Misfiled (not non-consol.) 
        21 same col., w/ hdg.        24 same col., w/o hdg. 
        22 same page, w/ hdg.       25 same page, w/o hdg. 
        23 other, w/ hdg.           26 other, w/o hdg. 
    30 Content uncertain 
    40 Wasted space (not non-consol.) 
    50 Non-consol. of entry 
        51 same col., w/ hdg.        54 same col., w/o hdg. 
        52 same page, w/ hdg.       55 same page, w/o hdg. 
        53 other, w/ hdg.           56 other, w/o hdg. 
    60 Non-consol. of heading only 
        61 same col., w/ hdg.        64 same col., w/o hdg. 
        62 same page, w/ hdg.       65 same page, w/o hdg. 
        63 other, w/ hdg.           66 other, w/o hdg. 
    70 Non-consol. of subj. heading 
        (counting subj. hdg. space) 
        71 same col., subj. hdg. non-con. 
        72 same page, subj. hdg. non-con. 
        73 other, subj. hdg. non-con. 
        74 same col., entry hdg. non-con. 
        75 same page, entry hdg. non-con. 
        76 other, entry hdg. non-con. 
        77 same col., whole entry non-con. 
        78 same page, whole entry non-con. 
        79 other, whole entry non-con. 
        (not counting subject hdg. space) 
        7A same col., whole entry misfile 
        7B same page, whole entry misfile 
        7C other, whole entry misfile 
        7E same col., body only misfile 
        7F same page, body only misfile 
        7N other, body only misfile 
        7P same col., entry hdg. non-con. 
        7Q same page, entry hdg. non-con. 
        7R other, entry hdg. non-con. 
        7X same col., entry non-con. 
        7Y same page, entry non-con. 
        7Z other, entry non-con. 
    80 Appearance only 
    90 Other or unknown 

IV. Cause 
    10 Keying error 
    20 Variant cataloging practice 
    30 Program processing 
    40 Processing (not keying or prog.) 
    50 Inadequate design 
    60 Other 
    70 Unknown 

V. Language 
    10 English 
    20 German 
    30 French 
    40 Spanish 
    50 Italian 
    60 Latin 
    70 Other Roman alphabet 
    80 Transliterated 
    90 Other 

VI. Non-monographic type 
    10 Monographic series 
    20 Serial 
    30 Music score, map, other 

FIGURE 1. FINAL VERSION OF ERROR CODE CATEGORIES 

Figure 2. Error Coding Data Sheet 

The following section discusses the way the errors were finally categorized 
and coded. 



C. ERROR CATEGORIZATION 



Figure 1 shows the error categories used in this study and their numeric 
or alpha-numeric codes. Figure 2 shows an example of an error coding data 
sheet. The errors are recorded according to six general aspects: Type, 
Location, Effect, Cause, Language, and Non-Monographic Type. With one 
exception, each error was assigned only one code in each of the six aspects. 
Five of the aspects were suggested by Douglas Ferguson in his design paper. 
The particular elements within this general categorization of error aspects 
have gone through at least 15 revisions during the course of our study. 

Our goal was to produce a structure which would allow the categorization 
and recording of UCUCS errors in a useful, thorough, and unambiguous manner. 
This goal necessitated these six aspects and their detailed contents. Although 
we have no doubt that improvements could still be made in our error category 
structure, we believe the one presented here meets our goal and could also be 
readily modified to suit the purposes of other similar studies. 
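
For readers adapting the structure to another study, one coded error can be pictured as a record with one field per aspect, checked against the code lists of Figure 1. This is a minimal sketch, not the data sheet or keypunch format used in the study: the class name `ErrorRecord`, the field names, the optional non-monographic field, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def _span(lo: int, hi: int) -> set:
    """Two-digit codes lo..hi inclusive, as strings."""
    return {str(n) for n in range(lo, hi + 1)}


# Code lists transcribed from Figure 1.
TYPE_CODES = {"10", "20", "50", "60", "70", "80"} | _span(30, 37) | _span(40, 46)
LOCATION_CODES = (_span(10, 19) | {"1A", "1B", "1C", "1D", "1X", "1Y"}
                  | _span(20, 26) | _span(30, 36))
EFFECT_CODES = ({"10", "30", "40", "80", "90"} | _span(20, 26) | _span(50, 56)
                | _span(60, 66) | _span(70, 79)
                | {"7" + c for c in "ABCEFNPQRXYZ"})
CAUSE_CODES = {"10", "20", "30", "40", "50", "60", "70"}
LANGUAGE_CODES = {"10", "20", "30", "40", "50", "60", "70", "80", "90"}
NONMONO_CODES = {"10", "20", "30"}


@dataclass(frozen=True)
class ErrorRecord:
    """One coded error: exactly one code per aspect (hypothetical layout)."""
    error_type: str
    location: str
    effect: str
    cause: str
    language: str
    nonmono: Optional[str] = None  # assumed blank for ordinary monographs

    def __post_init__(self) -> None:
        checks = [
            ("error_type", self.error_type, TYPE_CODES),
            ("location", self.location, LOCATION_CODES),
            ("effect", self.effect, EFFECT_CODES),
            ("cause", self.cause, CAUSE_CODES),
            ("language", self.language, LANGUAGE_CODES),
        ]
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{name}: unknown code {value!r}")
        if self.nonmono is not None and self.nonmono not in NONMONO_CODES:
            raise ValueError(f"nonmono: unknown code {self.nonmono!r}")


if __name__ == "__main__":
    # Illustrative values only: duplicate data not suppressed, in a main entry,
    # non-consolidation of entry, program processing, English.
    print(ErrorRecord("10", "10", "50", "30", "10"))
```

Rejecting unknown codes at construction time is one way to keep the recording unambiguous; a different study could swap in its own code lists without changing the record layout.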

Following is a description of the first aspect, Type. Some discussion of 
the Effect and Cause aspects will be given at the same time, since these three 
aspects are closely related and difficult to explain in isolation. Further 
description of Effect and Cause and of the other three aspects follows. 

1. Type 

Aspect I is Type. One of the Types from 10 through 80 (see Error Code 
sheet, Figure 1) was selected as best describing each error encountered. It 
was nearly always possible to place an error within one of the specific 
categories (10-70), but occasionally an error did not fit into any of these 
and was therefore coded Type 80. 

Type 10 (Duplicate data not suppressed) was used when an entry should 
have been consolidated with another entry and there is no apparent reason for 
the failure to consolidate. Generally, some discrepancy between two 
unconsolidated entries can be seen (for example, a typographical error or some 
difference in the way the two entries were cataloged), but occasionally the 
entries appear totally identical. The two entries in the example below are 
identical and should have consolidated; there is no difference in the 
data as the entries appear on the page. 

[Catalog excerpt: two separately printed main entries under a KING author heading, both for "How chemical reactions occur, an introduction to chemical kinetics and reaction mechanisms" (New York, W. A. Benjamin, 1963).] 

Figure 3. Example of Error Type 10: Duplicate Data Not Suppressed 



The fact that these entries should have consolidated is reflected in 
the Effect category, and the apparent (imputed) reason for this (a failure 
of the consolidation program in this case) is recorded in aspect IV-Cause. 

Type 20 (Variant form of same data or edition) was used when two or more 
entries represented the same edition of a work but were cataloged differently 
by the different campuses. (It might be argued that this is not really an 
error in a technical sense.) Generally, in order for two works to be considered 
the same edition, they had to have the same author, title, publication date 
and collation statement. (The precise operational definition of edition as 
used in designing the programs and in this study is shown in Appendix C.) 
This error type may also occur when two entries should have consolidated; 
it may also be associated with a misfile. These factors are recorded in the 
Effect column. Generally this type of error was caused by variant 
cataloging practice, coded as Cause 20. An example of error Type 20 is shown 
below in Figure 4. 



[Catalog excerpt: four added entries for "Reference syllabus for use in advanced reference classes" (1965), filed under OSTERHELD, Dora Miller, each with a differently formed see reference and a different call number.] 

Figure 4. Example of Error Type 20: Variant Cataloging Practice 

The Type 30s include all sorts of typographical errors and are 
usually, but not always, caused by a keying error (Cause 10). Program 
processing (Cause 30) may also result in some of the Type 30s. These 
error types are fairly self explanatory. Types 32 and 33 (missing string 
and added string) include missing or added punctuation but exclude missing 
or added blanks. Type 37 includes all instances of orthographic inaccuracy 
not covered by the earlier listed categories. The most typical instances 
of Type 37 were misplaced or incorrect accent marks and the replacement of 
a correct letter by an incorrect letter. Note the example below which 
includes a number of types of orthographic inaccuracies. 

Type 40 (String improperly used in filing) is also self explanatory. 
This category includes not only instances of improper filing of articles 
(Types 44 and 45) but also cases where function terms, such as "jt. auth.," 
"ed.," "comp.," or associated titles, such as "Sir" or "1st Baron," were 
used in determining filing sequence. In UCUCS, foreign language initial 
articles, all function terms and associated titles were considered in 
filing, frequently resulting in misfiling of such entries. The examples 
below illustrate some of the Type 40 categories. 



[Catalog excerpt: entries under two RICCI headings, including "Mille santi nell'arte" (Milano, U. Hoepli, 1931) and the added entry "Peasant art in Italy" (1913).] 

Figure 6. Example of Error Type 43: Misfile on Associated Title 

[Catalog excerpt: an entry under LABAILA y González, Jacinto, 1833-1895, followed by the title entry "L'ABAILARD supposé" (1780).] 

Figure 7. Example of Error Type 45: Misfile on Non-English Article 



Occasionally an entire data field (such as the date, collation, or title) 
would be missing from the entry. In this case, Type 50 was coded. Note the 
example below, in which the date is missing. Problems such as this 
could also be due to variant cataloging rules, but they were still coded 
as "data field missing." The example also shows a failure to 
consolidate because of spelling variations (of, on). 



[Catalog excerpt: two added entries under MILLS, Abraham, 1796-1867, ed., referring to Blair, Hugh: "Lectures of rhetoric and belles lettres" with no date printed, and "Lectures on rhetoric and belles lettres" (1860).] 

Figure 8. Example of Error Type 50: Data Field Missing 

Conversely, sometimes data fields appeared where they should not have 
appeared. This usually occurred in subject or other added entries. Added 
entries were intended to include only the following data elements: author 
(if present), short title, date, campus location code and call number. 
Therefore if a subtitle, publisher, or edition statement appeared in an 
added entry, it was recorded as an error of Type 60. This is not so serious 
an error as many other types, but it does frequently result in wasted catalog 
space (Effect 40). An error was determined to have the effect of wasted 
catalog space only when it resulted in at least one additional line of space 
being used. Notice the example in Figure 9. The first entry has two instances 
of Type 60, the first with Effect 90 (Other or unknown) and the second with 
Effect 40 (Wasted space). 



Figure 9. Examples of Error Type 60: Data Field Added; 
Effect 40: Wasted space 



When a Type 60 did not result in at least one line being unnecessarily 
used, the Effect category was coded as 90 (Other or unknown). Because the 
programs were intended to prevent these data fields from being printed but 
failed to do so in these cases, the Cause category is coded Cause 30 
(Program processing). However, it was later recognized that this error was 
more likely caused by a program design limitation. The problem occurred 
because the automatic format recognition program was not designed to be 
100% perfect in delimiting the $b subfield for subtitle, and there was no human 
post-edit to catch the deviations. 

UCUCS was expected to include only monographic materials; serials, 
music scores, phonodiscs, maps, and other non-monographic materials were 
to be excluded. Therefore, an instance of inclusion of such non-monographic 
material is in some sense an error or deviation from intent. Type 70 was 
coded whenever non-monographic material was encountered. This is an error 
not only because the catalog was intended to include only monographic 
materials but also because failure to exclude non-monographs wastes space 
in the catalog and therefore costs additional money and user time. All 
such cases were coded: Type 70, Effect 40, Cause 40. Cause 40 includes 
all processing which is not included in the keyboarding process or in 
program processing. In the case of inclusion of non-monographic materials, 
such entries should have been but were not excluded at some point in the 
manual procedure of selecting records to be sent to ILR or in selecting those 
to be microfilmed and forwarded to the keying vendor. An example of the 
most typical sort of Type 70, a serial entry, is shown on the following page. 
[Catalog excerpt: serial entries, including "Quellen und Darstellungen zur Zeitgeschichte" (Stuttgart, Deutsche Verlags-Anstalt, 1957- ).] 

Figure 10. Examples of Error Type 70: Inappropriate Entry 
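
The fixed coding conventions described above for Types 60 and 70 can be written down as two small rules. The following is only a sketch of those conventions as stated in the text; the function names, the `extra_lines` parameter, and the dictionary layout are hypothetical.

```python
def code_added_field(extra_lines: int) -> dict:
    """Type 60 (data field added).

    Effect 40 (wasted space) applies only when at least one additional
    line is used; otherwise Effect 90 (other or unknown). The Cause was
    coded 30 (program processing).
    """
    if extra_lines < 0:
        raise ValueError("extra_lines cannot be negative")
    effect = "40" if extra_lines >= 1 else "90"
    return {"type": "60", "effect": effect, "cause": "30"}


def code_non_monograph() -> dict:
    """Type 70 (inappropriate entry): always Effect 40, Cause 40."""
    return {"type": "70", "effect": "40", "cause": "40"}


if __name__ == "__main__":
    print(code_added_field(0))   # added field that fits on an existing line
    print(code_added_field(2))   # added field that costs two extra lines
    print(code_non_monograph())  # e.g. a serial included in the catalog
```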



Type 80 (Other) included all error types which failed to fit into any 
of the above categories. This category was used rarely, usually for 
incorrect type size as in the example below. 



[Catalog excerpt: an entry for a meeting on automation in the library held at Purdue University, set in the wrong type size.] 

Figure 11. Example of Error Type 80: Other 



2. Location 

The second aspect by which each error was coded is II-Location. Errors 
occurring in a main entry were coded in one of the 10s; errors occurring in 
added entries in the Author/Title catalog were coded in the 20s; and errors 
occurring in the Subject catalog were coded in the 30s. Nearly always the error 
could be located precisely in a specific part of the entry, for example, in 
the short title of a main entry or in the see reference of an added entry. 
Sometimes, however, an error could not be located specifically. For example, 
when an instance of duplicate data not suppressed (Type 10) occurs and two 
or more main entries are not consolidated (Effect 50), then it is impossible 
to say that the error occurred in the entry heading, the title statement, 
edition statement, etc., but merely that it occurred in a main entry. In 
such cases, the Location code 10 was used. The same approach was used in 
analogous cases for added and subject entries. 

This particular aspect is straightforward and unproblematic with one 
exception. Generally, it was possible to reflect in one numeric (or 
alphanumeric) code where the error was located; "where the error is located" 
meant both "where it appears" and "where its source is" since normally the 
two types of "location" coincide. For example, a typographic error appearing 
in an edition statement (Location 13) is noticed there by the reader and 
actually occurs there in the machine readable form of the record. Since 
the added entries and subject entries are generated from the source record 
which appears in full as the main entry, those errors appearing in an added 
or subject entry are usually traceable to an error in the main entry. This 
usually presents no problem in coding, however. Errors appearing in added 
or subject entries are coded in the 20s or 30s respectively in order to 
determine the error rates in these types of entries. For example, consider 
the added entries below. 



[Catalog excerpt: two added entries for "Anesthesia in clinical ophthalmology" (1963), printed separately rather than consolidated.] 

Figure 12. Example of Error in Which the Source of Probable 
Error Is Not Apparent from Sample Entries 



One can only determine from these entries that they may be instances 
of the same edition of a work, and therefore it may be that they should 
have consolidated. The main entries must be checked to determine whether 
they are duplicates. Their respective main entries are shown below. 

[Catalog excerpt: three separately printed main entries under DUNCALF, Deryck, for "Anesthesia in clinical ophthalmology" by Deryck Duncalf and David H. Rhodes (Baltimore, Williams & Wilkins, 1963).] 

Figure 13. Main Entry Form of Sample Added Entries Shown in Figure 12 

Here we see that the entries do represent the same edition and therefore 
should have consolidated but did not. The question then arises whether to 
indicate in the Location code where the error appears or where it is "caused": 
whether to indicate Location 20 or Location 17. The authors wanted very 
much to tabulate data both according to the location of error appearance 
and to the location of error source. We wanted to be able to say what 
percentage of the errors appeared in added and subject entries, what percentage 
of the added and subject entries contained at least one error, etc. But we 
also wanted to trace an error to its source location where possible, 
primarily in order to identify which parts of the computer programs could 
best be improved. The dilemma was resolved by breaking the general rule 
of "each error is to be assigned only one code for each of the six general 
categories." Both the source location (i.e., a main entry code in the 10s) 
and the location of error appearance (either 20 or 30) were recorded in the 
Location column. Later, when the data was keypunched, a separate column 
was created to handle this situation, and the data reduction program was 
designed to tabulate either of the two columns needed for a specific purpose. 
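
The two-column arrangement can be illustrated with a small tabulation routine. This is a sketch under stated assumptions, not the study's data reduction program: the record layout (a `location` key for where the error appears and an optional `source_location` key for where it originates) and the function name `tabulate_locations` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def tabulate_locations(records: Iterable[Dict[str, str]],
                       by: str = "appearance") -> Counter:
    """Count errors per Location code, by appearance or by source.

    When a record has no separate source code, the two locations are
    taken to coincide, as they normally do.
    """
    if by not in ("appearance", "source"):
        raise ValueError("by must be 'appearance' or 'source'")
    counts: Counter = Counter()
    for rec in records:
        code = rec["location"]
        if by == "source" and rec.get("source_location"):
            code = rec["source_location"]
        counts[code] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"location": "20", "source_location": "17"},  # seen in added entry, caused in collation
        {"location": "13"},                           # typo in an edition statement
        {"location": "31"},                           # error in a subject heading
    ]
    print(tabulate_locations(sample, by="appearance"))
    print(tabulate_locations(sample, by="source"))
```

Tabulating by appearance answers questions about error rates in added and subject entries, while tabulating by source points to the part of the record (and hence the program) where the error arose.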

As with the main entries, when an error could not be traced to its 
precise location either in an added or subject entry or in its corresponding 
main entry, the general category (20 or 30) was entered in the Location 
code. In cases where an extra data field was added to an added or subject 
entry, the location of the error was recorded as 26 or 36, respectively. 

3. Effect 

The third aspect according to which each error was coded is III — Effect. 
Each error has at least one effect, sometimes more than one. In each case, 
however, only one effect was assigned to each error. This rule occasionally 
presented some conflict, but the conflicts were generally reconcilable by 
fairly rational means. 

The first of the Effects, 10 (Lost entry point), is self-explanatory. 
It was used when there would be no way to find an item by means of an entry 
point which should have been available. Instances of misfiling where the 
entry is filed far from the proper filing point could be considered instances 
of lost entry points. However, in our study such errors were coded under 
Effect 20 (Misfiled). Misfiles which are so drastic as to have the effect 
of a lost entry point were grouped with the Effect 10s later in data 
reduction as being instances of "Fatal" errors. 

Effect 10 was used relatively rarely in this study because one planned 
section of the study was not carried out. It was originally intended that 
a sample of the microfilmed source records would be taken and that all of the 
appropriate entry points in these records be looked up in UCUCS. Another 
less thorough way to check for lost entry points would be to look up the 
appropriate added and subject entries for each main entry in our Author/ 
Title sample. This step was not undertaken either. A few lost entry points 
were noticed in the course of our study, however, and we believe that a 
systematic effort to measure this source of error should be undertaken. 

Effect 20 (Misfiled) includes all entries which have misfiled but in 
which the misfile is not due to failure to consolidate. All misfiled entries 
were categorized according to whether they were misfiled but were within 
their appropriate column (21 and 24), misfiled into another column on the 
same page (22 and 25), or misfiled onto another page (23 and 26). Each of 
these categories was further divided into cases in which the entire entry 
(heading plus body of the entry) misfiled, and those in which only the body 
of the entry (without the entry heading) misfiled. An example of "same 
column" misfile in which the entire entry has misfiled is shown below 
in Figure 14. 

Figure 14. Example of Effect 21: Entire Entry Misfiled in Same Column 

An example in which an entry without its heading misfiled in the same 
column is shown below in Figure 15. 



[Catalog excerpt: headingless entries for works by Eugene O'Neill, including two entries for "The Emperor Jones" separated by other titles.] 

Figure 15. Example of Effect 24: Entry Without Heading Misfiled in 
Same Column 



Effect 30 (Content uncertain) was used whenever an error resulted in 
some confusion or doubt about the exact meaning of some part of the entry. 
Frequently this category was used when a typographical error left the 
meaning of a word or phrase less than certain. "Uncertain" was strictly 
interpreted by the authors. There were many cases where a typographical 
error, for example, might have been interpreted by others as merely affecting 
the appearance of the word or data element because they would have been able 
to guess with some degree of confidence what was meant. We used Category 30 
instead of 80, however, when there was any reasonable chance that the error 
would have been confusing to the catalog user. Since the "catalog user" 
was defined as being non-librarians as well as librarians, we tended not to 
give the benefit of the doubt to the catalog in these cases. A typical 
example is shown in Figure 16. (The second entry is probably due to an 
AFR error.) 
[Catalog excerpt: two entries under OSTER, Gerald, ed.: a full main entry for "Physical techniques in biological research" (New York, Academic Press, 1955-64, 6 v.), and a second entry reading "v. See: PHYSICAL techniques in biological research. New York, Academic Press, 1955-".] 

Figure 16. Example of Effect 30: Content Uncertain 

Effect 40 (Wasted space, not non-consolidation) was discussed earlier. 
It was used whenever an error resulted in the space of at least one additional 
line being used in an entry. This frequently occurred in conjunction with 
an added data element (Type 60) and always occurred when a non-monographic 
entry was included in the catalog (Type 70). Although errors resulting in 
non-consolidation also result in space being wasted, they were coded under 
one of the non-consolidation categories (50 through 7Z) explained later in 
this section. In measuring the amount of space wasted in the catalog due 
to all kinds of errors, the 40 category is to be used in conjunction with 
the 50-7Z categories to arrive at an estimate. 
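
As a rough illustration of combining the 40 category with the 50-7Z categories, the sketch below counts the errors whose Effect code implies wasted space. It counts errors rather than lines, since converting to lines would need per-category space figures not given in this section. The function names are hypothetical, and Effect codes are assumed to be stored as two-character strings as in Figure 1.

```python
from typing import Iterable


def wastes_space(effect: str) -> bool:
    """True for Effect 40 and for every non-consolidation code (50 through 7Z)."""
    return effect == "40" or effect[:1] in {"5", "6", "7"}


def count_space_wasting(effects: Iterable[str]) -> int:
    """Number of coded errors that contribute to wasted catalog space."""
    return sum(1 for effect in effects if wastes_space(effect))


if __name__ == "__main__":
    coded = ["40", "51", "7Z", "80", "10", "64"]
    print(count_space_wasting(coded), "of", len(coded), "errors waste space")
```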

Care was taken in designing the error code structure to be able to 
estimate the seriousness of the non-consolidation problems in UCUCS. This 
goal necessitated a very specific and rather complex breakdown of types of 
non-consolidation. Non-consolidation can occur in main, added, or subject 
entries; the type of entry which has failed to consolidate will affect the 
amount of space wasted by that failure. Also, the entire entry may fail 
to consolidate, or just its heading, or, in the case of subject entries, 
the subject heading may fail to consolidate, and this in turn may result 
in failure of entire entries or just entry headings to consolidate. 

It was assumed that any non-consolidation also represents a deviation of 
sorts, even in cases where the entries are adjacent. In the following 
paragraphs we will attempt to state the meaning of each of the non-consoli- 
dation categories. 

Effect 50 (Non-consolidation of entry) includes several subdivisions. 
One of the 50s Effect Codes is used when the body of the entry should have 
consolidated with another entry but failed to do so. Codes 51, 52 and 53 
are used when the entire entry (body plus heading) failed to consolidate. 
(Code 51 is used when the entry should have consolidated with another entry 
in the same column, 52 when it should have consolidated with another entry 
on the same page, and 53 when it should have consolidated with another entry 
on another page.) Codes 54, 55, and 56 are used when an entry without a 
heading should have consolidated; 54 is for the same column, 55 for the 
same page, and 56 for another page. Figure 17 illustrates the use of 
code 51 (entire entry non-consolidation including heading, same column) 
and Figure 18 shows three instances of 54 (only body of entry failed to 
consolidate, same column). 
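
The second digit of these codes follows a regular pattern: 1 through 3 mean the heading is included, 4 through 6 mean it is not, and within each group the digit gives same column, same page, or another page. A small decoder makes the pattern explicit. It is a sketch only; the function name and return format are hypothetical, and it covers just the 20s, 50s and 60s, which share this layout in Figure 1.

```python
from typing import Tuple

FAMILIES = {
    "2": "misfiled",
    "5": "non-consolidation of entry",
    "6": "non-consolidation of heading only",
}
PLACES = ("same column", "same page", "other page")


def decode_effect(code: str) -> Tuple[str, str, bool]:
    """Split a code such as '51' into (family, place, with_heading)."""
    if len(code) != 2 or code[0] not in FAMILIES or code[1] not in "123456":
        raise ValueError(f"not a subdivided 20s/50s/60s effect code: {code!r}")
    digit = int(code[1])
    place = PLACES[(digit - 1) % 3]  # 1,4 -> column; 2,5 -> page; 3,6 -> other
    with_heading = digit <= 3        # 1-3 with heading, 4-6 without
    return FAMILIES[code[0]], place, with_heading


if __name__ == "__main__":
    for code in ("51", "54", "26"):
        print(code, decode_effect(code))
```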
[Catalog excerpt: two title entries for "OSTEOTOMY at the upper end of the femur" (1965), each referring to Milch, Henry, printed separately in the same column.] 

Figure 17. Examples of Effect 51: 
Entire Entry Non-Consolidation, 
Same Column 

[Catalog excerpt: entries under OSTER, Daniel, ed., in which "Œuvres complètes" (1964), referring to Montesquieu, appears in several separately printed entry bodies.] 

Figure 18. Examples of Effect 54: 
Only Body of Entry Non-Consolidation, 
Same Column 



Effect 60 (Non-consolidation of heading only) has the same subdivisions 
as Effect 50. The 60 codes are used when the body of the entry correctly did 
not consolidate (i.e., the entry represents a different work or edition 
from any other on that sample page), but the entry's heading should have 
consolidated and failed to do so. Even if the entry associated with a particular 
non-consolidated heading should not have consolidated, it will still be 
misfiled as a result of its heading failing to consolidate. Therefore, the 60s 
codes indicate whether the entry is misfiled onto the same column, same 
page, or another page from where it should be filed. The first three (61, 
62, and 63) are used for entries with headings "attached" which have been 
misfiled because of non-consolidation of headings. See Figure 19 for an 
illustration of a 61. The second three (64, 65, and 66) are used for 
entries which consist only of an entry body, but whose headings (actually 
appearing with an entry filed ahead of them) failed to consolidate. Again 
an example is needed to make this intelligible; Figure 20 shows an instance 
of a 64. 

Figure 19. Example of Effect 61 -- 
Non-Consolidation of Heading Only, 
Resulting in Misfile of Entire 
Entry, Same Column 

Figure 20. Example of Effect 64 -- 
Non-Consolidation of Entry Heading 
Only, Resulting in Misfile of Body 
of Entry Only, Same Column 

It may be noted that the lack of interfiling of main and added entries 
for the same name is actually a filing policy or practice, not an "error." 
We can say it is, in UCUCS, a deviation from a desired practice, but the 
desired practice was not implemented because, by management decision, the 
data was not encoded to accomplish interfiling. 
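
The 50s and 60s Effect codes described above share one layout: the tens digit names the category, and the units digit combines the part of the entry affected with the distance between the entry and the place where it belongs. The following is a minimal sketch of that layout, not a program used in the study; the function name and the label strings are assumptions made for illustration.

```python
# Illustrative decoder for Effect codes 51-56 and 61-66 as described above.
# The function name and the label strings are assumptions, not the study's own.

CATEGORY = {
    5: "non-consolidation of entry",
    6: "non-consolidation of heading only",
}
DISTANCE = ("same column", "same page", "another page")


def decode_effect(code: int) -> tuple[str, str, str]:
    """Return (category, part affected, distance) for a 50s or 60s Effect code."""
    tens, units = divmod(code, 10)
    if tens not in CATEGORY or not 1 <= units <= 6:
        raise ValueError(f"not a 51-56 or 61-66 Effect code: {code}")
    # Units 1-3: entry with its heading attached; units 4-6: body of entry only.
    part = "entry with heading" if units <= 3 else "body of entry only"
    return CATEGORY[tens], part, DISTANCE[(units - 1) % 3]


print(decode_effect(51))
print(decode_effect(64))
```

Decoding by arithmetic rather than by a lookup table makes the shared structure of the two categories explicit.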

Effect 70 (Non-consolidation of subject heading) contains many subdivi- 
sions. The complexities of the above two general non-consolidation categories 
(50s and 60s) pale by comparison to the 70s category. This category combines 
all the previous possibilities with the non-consolidation of an entry's subject 
heading. There are 22 possible codes when a subject heading has failed to 
consolidate. These codes are first divided into two groups: those in which 
the space taken up by the subject heading itself should be included in this 
particular error coding, and those in which the space taken up by the 
subject heading should not be included in the code. To make clear why this 
division is necessary, note Figure 21 below, in which the cause of the 
problem is the failure of the authority control software. 



[Catalog excerpt: three entries (Armstrong, McCurdy, and Wilson) are filed 
under the heading SHAKESPEARE, WILLIAM--BIOG.--CHARACTER., separately from 
the heading SHAKESPEARE, WILLIAM--BIOGRAPHY--CHARACTER., under which entries 
by Armstrong, Bagehot, and Beeching appear.] 



Figure 21. Example of Subject Heading Non-Consolidation 

In this illustration, the first subject heading failed to consolidate 
with the second subject heading which appeared in slightly different form. 
This fact is reflected in coding each of three entries appearing under 
the first subject heading. But the amount of space wasted by the non- 
consolidated subject heading should be tallied only once. In such cases, 
then, the first entry listed under a non-consolidated subject heading was 
coded to include consideration of the amount of space wasted by the subject 
heading, and subsequent entries under the same subject heading were tallied 
so as not to include the wasted space of the subject heading. 
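
The rule that the heading's wasted space is tallied only with the first entry can be sketched as follows. This is an illustration only; the list-of-names layout and the function name are assumed, and the three names are those of the entries under the first heading in Figure 21.

```python
# Sketch of the "count the heading space once" rule. The data layout (a list of
# entries under one non-consolidated subject heading) is assumed for illustration.

def heading_space_flags(entries_under_heading: list[str]) -> list[tuple[str, bool]]:
    """Pair each entry with a flag: True if the heading's wasted space is tallied with it."""
    return [(entry, index == 0) for index, entry in enumerate(entries_under_heading)]


flags = heading_space_flags(["ARMSTRONG", "MCCURDY", "WILSON"])
print(flags)
```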

In each instance, after deciding whether the subject heading space 
should be considered in coding an entry in error, the next step is 
analogous to determining the appropriate 50s or 60s code as discussed 
previously. One needs to determine exactly what part(s) of the entry (if any) 
have failed to consolidate and whether the entry was misfiled on the same 
column, same page or another page. Codes 71, 72, and 73 are used when no 
part of the entry (except its subject heading) should have consolidated, 
so that it has misfiled only because its subject heading was inappropriately 
duplicated. Codes 74, 75, and 76 are used when not only the entry's subject 
heading but also the entry heading failed to consolidate, but the body of 
the entry correctly did not consolidate. Codes 77, 78, and 79 are used 
when not only the subject and entry heading but also the body of the entry 
should have consolidated. The nine codes listed above are only used when 
the space wasted by the non-consolidated subject heading is to be counted; 
that is, they are used only for the first entry appearing under a particular 
subject heading. 

Entries in which the subject heading space is not to be considered-- 
that is, second, third, and subsequent entries under a subject heading-- are 
coded with one of the 12 alpha-numeric 70s codes. Here, in cases where 
only the subject heading itself was supposed to have consolidated (analogous 
to the 71, 72, and 73 above), a separate division must be made for cases in 
which the entire entry (heading plus body) misfiled as a result of the 
subject heading non-consolidation and for those cases in which only 
the body of an entry misfiled. Figure 22 shows an instance of an entire 
entry misfiling because of its subject heading failing to consolidate. 



[Catalog excerpt: an entry by the Shakespeare Association, London (Studies 
in the first folio, 1924) is filed under the heading SHAKESPEARE, 
WILLIAM--BIBLIOGRAPHY--FOLIIOS. 1623., separately from the heading 
SHAKESPEARE, WILLIAM--BIBLIOGRAPHY--FOLIOS. 1623., under which an entry by 
George Watson Cole (The first folio of Shakespeare, 1909) appears.] 

Figure 22. Example of Effect 7A -- Subject Heading 
Non-Consolidation Resulting in Entire 
Entry Misfile, Same Column 



The other six codes in this category are used when either the entry 
heading should have consolidated but did not (codes 7P, 7Q, 7R), or when 
the entire entry failed to consolidate (codes 7X, 7Y, and 7Z). An 
illustration of each type of case is shown in Figures 23 and 24. 
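
The 70s codes that the text names explicitly can be collected in a lookup table. The sketch below is an illustration, not the study's own coding aid. Two assumptions should be noted: the distances for 7P, 7R, 7Y, and 7Z are inferred from the column/page/other-page order used everywhere else and from the captions of Figures 23 and 24, and the alpha-numeric codes that the text does not name (other than 7A) are left out rather than guessed.

```python
# Lookup for the 70s Effect codes that the text names explicitly. Codes for the
# remaining alpha-numeric cases are not spelled out in the text, so they are
# deliberately left out rather than guessed.

DISTANCE = ("same column", "same page", "another page")


def build_table() -> dict[str, tuple[bool, str, str]]:
    """Map code -> (heading space counted?, part that failed to consolidate, distance)."""
    table = {}
    numeric_groups = [
        (71, "subject heading only"),
        (74, "subject heading and entry heading"),
        (77, "subject heading, entry heading and body"),
    ]
    for first, part in numeric_groups:
        for offset, distance in enumerate(DISTANCE):
            table[str(first + offset)] = (True, part, distance)
    alpha_groups = [
        ("PQR", "subject heading and entry heading"),
        ("XYZ", "subject heading, entry heading and body"),
    ]
    # Assumption: letters follow the column / page / other-page order.
    for letters, part in alpha_groups:
        for letter, distance in zip(letters, DISTANCE):
            table["7" + letter] = (False, part, distance)
    # 7A is the only "subject heading only" alpha code named in the text
    # (Figure 22: entire entry misfiled, same column).
    table["7A"] = (False, "subject heading only", "same column")
    return table


EFFECT_70S = build_table()
print(EFFECT_70S["7Q"])
```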



[Catalog excerpt: entries by Joseph Quincy Adams (A life of William 
Shakespeare) and Peter Alexander (A Shakespeare primer) appear under the 
heading SHAKESPEARE, WILLIAM--BIOG. and again under the heading 
SHAKESPEARE, WILLIAM--BIOGRAPHY.] 

Figure 23. Example of Effect 7Q -- 
Subject Heading Non-Consolidation 
Resulting in Entry Heading 
Non-Consolidation, Same Page 

[Catalog excerpt: the Army Times entry (The challenge and the triumph, 1966) 
appears under the heading EISENHOWER, DWIGHT DAVID, PRES. U.S. and again 
under the heading EISENHOWER, DWIGHT DAVID, PRES. U.S., 1890-.] 

Figure 24. Example of Effect 7X -- 
Subject Heading Non-Consolidation 
Resulting in Entire Entry 
Non-Consolidation, Same Column 

Category 80 was used when the only result of an error was some adverse 
effect on the appearance of the catalog. For example, it could have made 
the entry difficult to read or just unattractive. But if the error might 
conceivably have resulted in some confusion on the part of the user regarding 
the content of the entry, Code 30 was used instead of Code 80. An example 
of a number of instances of Effect 80 appears in Figure 25. 



Figure 25. Examples of Effect 80 — Appearance Only 



The 90 code for Other or unknown was rarely used. Its primary use was 
in conjunction with program processing errors which printed out unnecessary 
data fields but which resulted in no wasted space in the catalog. (Type 60, 
Effect 90, Cause 30.) Note the example in Figure 26. 



Figure 26. Example of Effect 90 — Other or Unknown 
4. Cause 

The fourth aspect by which each error was coded is Cause. Whenever 
possible the cause of an error was determined and coded. It was rarely 
possible to determine with absolute certainty the cause of a particular 
error, although usually we could be fairly confident. When we doubted 
the validity of our opinion of an error's cause, we coded Cause 70 (Unknown). 
In general, though, we attempted to assign each error a specific cause. 

Cause 10 (Keying error) is self explanatory. This code includes all 
those mistakes that we normally think of as typographical or keyboarding 
errors. 

Cause 20 (Variant cataloging practice) includes all instances where a 
discrepancy in the way two or more campuses (or even libraries within the 
same campus) cataloged a given work resulted in some deviation or inconsistency 
in the catalog. Frequently, some minor discrepancy would result in 
non-consolidation of records; occasionally variant cataloging practice resulted 
in misfiling of a record. Such cases were counted. But instances in which a 
discrepancy in cataloging happened to be noticed but did not result in error 
were not tallied. 

Cause 30 (Program processing) includes all instances in which an 
error was caused by the failure of a program to do what it was designed 
to do. This could have been any of the various programs used to produce 
UCUCS: automatic format recognition, generation of added entries, the 
print program, or whatever. 

Cause 40 (Processing other than keying or programming) was coded when 
there was apparently some slip-up in processing which was not program 
processing or keyboarding. This code was used primarily for instances of 
non-monographic material being included in the catalog. 

Cause 50 (Inadequate design) was used to cover cases in which a program 
should or could have been written to handle a particular type of situation. 
For example, when an English article is improperly used in filing, the error 
was coded Cause 30 (Program processing) because a program was written to 
prevent such occurrences; the program apparently failed in this particular 
case. But when a non-English article is improperly used in filing, we coded 
for Cause 50 (Inadequate design) since no program was written to suppress 
these articles in filing, and conceivably one could have been written to 
handle such situations, given sufficient time and budget. 

Cause 60 (Other) was to be used when the cause did not fit any of the 
above categories but was known to us. It turned out that Cause 60 was 
only rarely assigned in this study. Cause 70 (Unknown) was used when we 
could not determine the cause of the error with reasonable confidence. 

5. Language 

The fifth aspect, V -- Language, is self explanatory. The language 
in which the error itself was found was the language which was coded. 
That is, if an error occurred in a title of a work and the title was 
written in French, the code used was 30. But if the error occurred in a 
subject tracing which was written in English, the code used was 10, even 
though the title and the work itself may have been in French. In those 
cases discussed earlier in which the error cannot be pinpointed to a 
specific location in the entry, the Language code is used which reflects 
the language of the title of the work. 
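
The Language rule can be sketched in a few lines. This is a hedged illustration only: the text gives just the codes 10 (English) and 30 (French), and the record layout, the field names, and the function name are assumptions.

```python
# Sketch of the Language-coding rule. Only codes 10 (English) and 30 (French) are
# given in the text; the dictionary layout and function name are assumptions.
from typing import Optional

LANGUAGE_CODE = {"English": 10, "French": 30}


def language_code(field_languages: dict[str, str], error_field: Optional[str]) -> int:
    """Code the language of the field holding the error; fall back to the title's language."""
    field = error_field if error_field is not None else "title"
    return LANGUAGE_CODE[field_languages[field]]


record = {"title": "French", "subject tracing": "English"}
print(language_code(record, "title"))            # error in the French title
print(language_code(record, "subject tracing"))  # error in the English tracing
print(language_code(record, None))               # error not tied to one field
```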

6. Non-Monographic Type 

The final aspect is VI -- Non-monographic type. The first item in this 
category, 10 (Monographic series), does not imply that monographic series 
should not be included in the catalog. They should have been and were 
included. The category is given here to evaluate errors which happen to 
occur in monographic series records. It was thought that different error 
patterns might emerge in such records. Thus no error was recorded just 
because an entry happened to be a monographic series, but if an error 
occurred in such a record it was coded 10 in this category. 

As mentioned previously, only monographs were intended by the system 
designers to be included in this catalog. Serials, music scores, maps, 
phonodiscs, etc., were intended to be excluded, but such records were not 
always successfully omitted. When such a record was found, it was coded 
here either 20 or 30 depending on the type of entry. This category was 
used only to code non-monographic entries; for ordinary monographic entries 
a dash was placed in the Non-monographic type category. 

7. Comments 

The final column in the Error Coding Data Sheet was reserved for 
comments. It was used, especially in the early part of the study, for 
noting problems or questions which needed to be discussed and resolved 
by the two authors doing the detailed review. It was also used to record 
multiple instances of the same type of error. For example, if there were 
four typographic errors in the title of an entry with the same cause, 
effect, etc., the first instance of the error would be coded in the six 
categories and "X 4" would be noted in red in the comments column. In this 
way the keypuncher was alerted simply to punch one card and duplicate it 
three times. 
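
The "X 4" convention amounts to expanding one coded line into several identical cards. A minimal sketch follows; the way the note is parsed out of the comment text is an assumption, since the text says only that "X 4" was written in the Comments column.

```python
# Sketch of the "X 4" convention: one coded line on the data sheet stands for
# several identical errors, and the keypuncher duplicates the card accordingly.
# The parsing of the comment text is an assumption about how the note was written.
import re


def cards_for_line(coded_line: str, comment: str = "") -> list[str]:
    """Return one card image per error represented by a data-sheet line."""
    match = re.search(r"\bX\s*(\d+)\b", comment)
    count = int(match.group(1)) if match else 1
    return [coded_line] * count


cards = cards_for_line("coded-line", "X 4")
print(len(cards))
```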

The "Comments" column was also used to record that an entry was from 
the Santa Cruz campus only. The Santa Cruz records were sent to ILR in 
machine readable form, unlike the rest of the campuses which sent catalog 
cards. Since certain types of errors were observed by the authors to occur 
exclusively or nearly exclusively in Santa Cruz records, we decided to note 
all errors in Santa Cruz records by writing "SC" in the "Comments" column. 
As many of these errors may be due to the fact that the records were already 
in machine-readable form when ILR received them, it was felt that recording 
Santa Cruz record errors would facilitate improving the separate programs 
designed to reconcile these records with the rest of the file. 

D. METHOD OF CONVERSION OF COLLECTED DATA INTO MACHINE READABLE FORM 
FOR TABULATION 



The data were carefully collected on code sheets such as the one shown 
earlier in Figure 2, and then punched onto IBM cards in the order in which 
they appeared on the sheets. One card was used for each error. Each card, 
then, started with a two-digit volume number, a three-digit page number, a 
one-digit column number, and a two-digit entry number. (Numbers in each 
category which had fewer digits were right justified and zeros were placed 
ahead of them.) The resulting eight-digit number represented a unique iden- 
tification number by which an error could be relocated in the catalog. 
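
The construction of the eight-digit identification number can be sketched as follows. The field widths and the zero-filling are taken from the description above; the function name and the sample values are illustrative only.

```python
# Sketch of the eight-digit identification number: volume (2 digits), page (3),
# column (1) and entry (2), each right-justified and zero-filled.
# The function name and the sample values are illustrative.

def error_id(volume: int, page: int, column: int, entry: int) -> str:
    """Build the eight-digit identifier that begins each card."""
    for value, width in ((volume, 2), (page, 3), (column, 1), (entry, 2)):
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{value} does not fit in {width} digit(s)")
    return f"{volume:02d}{page:03d}{column:1d}{entry:02d}"


print(error_id(volume=7, page=45, column=2, entry=9))
```

Rejecting values that overflow their field keeps every identifier exactly eight digits long, which is what makes it usable for relocating an error in the catalog.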

The codes for the six major error categories then appeared consecutively 
in card columns 9 through 18. Card column 30 was used to record that an 
error was located in a Santa Cruz entry (as the number "1") ; column 30 was 
left blank when the record was not from Santa Cruz. 

In the section above describing the second error aspect, location, it 
was mentioned that in some cases two Location codes had to be recorded. This 
occurred when the place where the error was noticed (in an added entry, for 
example) was different from the source of the error (in a part of the main 
entry which did not appear in the added entry, such as the publisher 
statement). In coding such errors, both types of location were noted in the 
Location column of the data sheet. But since both Location codes could not 
be punched in the same column of an IBM card, a separate part of the IBM 
card was used for one of the Location codes. It was arbitrarily decided 
that the place where an error appeared (that is, Location code 20 or 30) 
would be punched in the regular Location code column and that the source 
of the error (that is, Location codes 1A through 19) would be punched in 
columns 35 and 36 of the IBM card. Accordingly, the data sheets were all 
re-scanned and multiple entries in the Location column were erased. The 
code 20 or 30 was written in the Location column, and the source location 
code was noted in the center of the Comments column. 
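
The card layout described so far can be sketched as a small parser. It uses only the column positions stated above (columns 1-8, 9-18, 30, and 35-36). How columns 9 through 18 are divided among the six aspect codes is not stated, so that field is kept as one undivided string; the field names and the sample card are assumptions for illustration.

```python
# Sketch of reading one 80-column card image using only the column positions
# stated in the text. How columns 9-18 are divided among the six codes is not
# stated, so that field is kept as one undivided string. Names are illustrative.

def parse_card(card: str) -> dict[str, object]:
    """Split a card image into the fields whose columns are given in the text."""
    card = card.ljust(80)

    def cols(first: int, last: int) -> str:
        # Card columns are numbered from 1 and the range is inclusive.
        return card[first - 1:last]

    return {
        "volume": int(cols(1, 2)),
        "page": int(cols(3, 5)),
        "column": int(cols(6, 6)),
        "entry": int(cols(7, 8)),
        "aspect_codes": cols(9, 18),
        "santa_cruz": cols(30, 30) == "1",
        "source_location": cols(35, 36).strip() or None,
    }


# Made-up sample card: identifier, ten placeholder digits, a Santa Cruz flag in
# column 30, and a source Location code in columns 35-36.
sample = "07045209" + "0000000000" + " " * 11 + "1" + " " * 4 + "41"
print(parse_card(sample))
```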

One other modification of the recorded data was necessary before key- 
punching the data was possible. Since the program used to tabulate the data 
could manipulate data only in numeric form and since some of the codes used 
in the study were alpha-numeric, these had to be changed to numeric form before 
keypunching. Accordingly, the Location codes 1A, 1B, 1C, 1D, 1X, and 1Y were 
erased on the data sheets and replaced with 41, 42, 43, 44, 45, and 46 
respectively. Alpha-numeric codes had also been used in the third error 
aspect, Effect. Changing the alpha-numeric codes to numeric form was a 
little more problematic here since nine major categories were used (numbered 
10, 20, ...90), and we did not want to use the same first digit for categories 
which were conceptually unrelated. That is, we did not want to number instances 
of Effect 70 (Non-consolidation of subject heading) using the first digit 
of another category (31, 32, etc., for example). The problem was resolved 
in the following manner: the data recorded as Effect 7A, 7B, etc., were 
re-coded as 70 in the Effect column of the data sheets, and a separate 
column was created in the Comments column. The data in this part of the 
Comments column was keypunched in columns 38 and 39 of the IBM cards. This 
data consisted of the digits 01 through 12. By way of illustration, data 
originally coded as 7A and 7Y in the Effect column of the data sheet were 
recorded there as 70 and in the Comments column as 01 and 11, respectively. 

All keypunching, except approximately 100 cards keyed and proofread 
by one of the authors, was done by the U.C. Berkeley Computer Center 
keypunchers. The punched cards were key verified, except for the hundred or 
so which were manually proofread. That is, approximately 98.5% of the 
data cards were key verified. 
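
The two re-coding steps described above can be sketched as simple substitutions. The Location mapping and the pairs 7A to 01 and 7Y to 11 come directly from the text; the sub-codes for the other alpha-numeric Effect codes are not given, so the sketch refuses to guess them. Function names are illustrative.

```python
# Sketch of the numeric re-coding step. The Location mapping and the pairs
# 7A -> 01 and 7Y -> 11 come from the text; sub-codes for the other
# alpha-numeric Effect codes are not given there and are not guessed.

LOCATION_RECODE = {"1A": "41", "1B": "42", "1C": "43", "1D": "44", "1X": "45", "1Y": "46"}
EFFECT_70S_SUBCODE = {"7A": "01", "7Y": "11"}


def recode_location(code: str) -> str:
    """Replace an alpha-numeric Location code with its numeric substitute."""
    return LOCATION_RECODE.get(code, code)


def recode_effect(code: str) -> tuple[str, str]:
    """Return (Effect column value, value for card columns 38-39)."""
    if code in EFFECT_70S_SUBCODE:
        return "70", EFFECT_70S_SUBCODE[code]
    if len(code) == 2 and code[1].isalpha():
        raise KeyError(f"sub-code for {code} is not given in the text")
    # Purely numeric Effect codes are punched unchanged; columns 38-39 stay blank.
    return code, "  "


print(recode_location("1X"), recode_effect("7Y"))
```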

E. SUGGESTED IMPROVEMENTS IN THE METHODOLOGY 

Hindsight is usually better than foresight, and carrying out this 
project proved to be no exception to that general rule. Hindsight has 
dictated a number of suggestions for improving our methodology. These will 
be discussed here so that anyone attempting a similar study can incorporate 
them into their methodology. 

A shortcoming of the present design is that there is no allowance 
made for recording whether an entry was a title main entry or corporate 
author main entry. After most of the sample sheets had been analyzed, 
we began to notice that certain kinds of errors seemed to appear more 
frequently in title main entries than in author main entries. But this 
is merely a subjective impression, and we did not have time to re-design 
the error categories and re-examine the sample pages in order to record 
the necessary information. It should not be necessary to designate 
a separate, i.e., seventh, major category in order to record such 
information as the errors are described on the data sheets. One way of 
including the information would be to circle or underline the entry in the 
Location column if the error occurred in a title or corporate author main 
entry. 

In this study, Cause 60 (Other) was rarely used. Probably a combined 
category for "Other or Unknown" would suffice for most studies. 

There was no attempt to estimate the number of lost entry points in 
the catalog. In a few cases the analysts stumbled upon such instances, 
but there was no systematic attempt to discover the probable rate of lost 
entry points. Such a systematic attempt might be made by taking a random 
sample of the source records for the catalog and singly looking up all the 
entry points in the catalog indicated by the record. Where this is not 
possible, just checking whether all the entry points indicated by the main 
entries in the sample data sheets are in the catalog would give some idea 
of the number or extent of missing entry points. Of course, if the main 
entry itself were omitted from the data base, both it and all its added 
entries would be lost, and there would be no way of recording such oc- 
currences if the latter method is used. For this reason, checking from 
source records is obviously a better method. 

A fourth suggestion is to employ multi-lingual analysts if possible. 
One of the present authors has a reading knowledge of Spanish and Portuguese. 
Many other foreign language entries (152 of them) had to be read by other 
people familiar with these languages. This was time-consuming since one of 
the analysts had to go over the entries with these people to make sure 
the errors were coded consistently. If the authors had been able to read 
more of the languages found in the catalog, this part of the study would 
have been accomplished more efficiently. 

A very simple improvement would be to add a subcategory in the Type 
30s (Orthographic inaccuracy) to include all instances of incorrect ac- 
cent marks, umlauts, and other such characters. In the present study 
these errors were coded 37 (Other misspelling or undetermined), along 
with a wide variety of other kinds of errors. 

Finally, an error in our procedure was the failure to have re-keyed 
(and then key verified) the hundred or so cards punched by one of the 
authors and manually proofread. There are some discrepancies in the 
tabulated results of the study which may be due to keying errors in this 
relatively small batch of punched cards which were not key verified. 
Some of the discrepancies (or possibly all of them) may result from other 
causes. Over 92,000 digits and letters were handwritten on the data sheets 
and then keypunched, so it is likely that some characters were illegibly 
written and misread by both the keypuncher and the key verifier. More- 
over, errors could easily have resulted when the data sheets were re- 
scanned and some of the data re-coded in order to suit the requirements 
of the data reduction program. Since nearly all the discrepancies are 
in the subject catalog section (where most of the re-coding was done), it 
seems likely that this factor contributed to the error. In any case, all 
the punched cards should have been key verified, not just 98.5% of them. 

IV. RESULTS 



A. INTRODUCTORY COMMENTS 

Once the data had been keypunched and the computer program most appro- 
priate to our needs had been identified by the programmer, it was relatively 
easy to generate tables correlating the various aspects of the errors recorded. 
We attempted, therefore, to produce all those tables which might be of in- 
terest to people associated with the UCUCS project or to people who might 
want to compare our results with those of similar future studies. The 
program was used to generate 48 tables, as well as other data such as the 
total number of errors in the Author/Title catalog and in the Subject catalog. 
All of the data produced by these programs are included in this report. 
Some of the data is introduced and discussed in this section of the report, 
namely those tables and results which seem most likely to be of general 
interest. 

It was noticed by the authors early in the study that when one 
error was found in an entry, chances were that another would be found in 
the same entry. That is, many entries had no error, and it seemed that 
those which contained one error often had more than one. This was an 
interesting subjective observation, and it was therefore hoped that the 
program used in data reduction would be capable of determining the average 
number of errors per entry of the entries in error. The page, 
volume, column, and entry numbers of each error (together representing 
an entry identification number) were recorded, so theoretically it would 
have been possible for the program to store this information for each 
error, to record the number of errors associated with each of the entries 
which contained some error, and then to find the average number of errors per 
entry of the entries in error. But this feat was beyond the ken of 
the data reduction program used for this study. 
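
The computation that the data reduction program could not perform is easy to state with present-day tools. The following is a minimal sketch, assuming the eight-digit entry identifier of each error card is available as a string; the sample identifiers are made up and are not data from the study.

```python
# Sketch of the tabulation the data reduction program could not perform: group
# the error cards by their eight-digit entry identifier and average the counts.
# The identifiers below are made-up sample values, not data from the study.
from collections import Counter


def mean_errors_per_entry_in_error(entry_ids: list[str]) -> float:
    """Average number of errors over those entries that contain at least one error."""
    per_entry = Counter(entry_ids)
    if not per_entry:
        raise ValueError("no error cards supplied")
    return sum(per_entry.values()) / len(per_entry)


sample_ids = ["07045209", "07045209", "07045209", "07045210", "12300101", "12300101"]
print(mean_errors_per_entry_in_error(sample_ids))  # 6 errors over 3 entries
```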

As mentioned earlier in the study, it was noticed during the process 
of data collection that certain error patterns seemed to appear in records 
with a location code indicating they came from the Santa Cruz campus, the 
only campus which sent records already in machine-readable form and whose 
records were processed differently from the remainder of the records. 
Therefore, errors found in Santa Cruz records were so noted on the data 
sheets. It was possible, then, to generate tables correlating any error 
aspects for errors occurring in Santa Cruz records just as it was possible 
to generate such tables for the entire body of data. All tables produced 
were therefore done for the entire body of data and also for the subset 
of data from the Santa Cruz campus. 

In addition to the tabulation of error data peculiar to Santa Cruz 
entries, all of the tables presented in this report are generated for both 
the Author/Title catalog sample and for the Subject catalog sample. That 
is, each correlation of two error aspects (for example, error type and 
cause) appears in segregated tables for the two parts of the catalog. 
Moreover, as explained above, each correlation is also divided according 
to the data for the entire sample and for the Santa Cruz records only. 
Therefore, each correlation of error aspects appears in four separate 
tables: one for the entire Author/Title catalog, one for the Santa Cruz 
records in the Author/Title catalog, one for the entire Subject catalog, 
and one for the Santa Cruz records in the Subject catalog. Thus, while 48 
tables were created, only 12 actual correlations of error aspects are 
presented. 
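
The count of 48 tables follows from crossing the 12 correlations with the two catalog sections and the two record subsets. A short sketch of that enumeration, with the correlations simply numbered for illustration:

```python
# Enumerate the tables: 12 correlations x 2 catalog sections x 2 record subsets.
# The correlations are numbered 1-12 here only for illustration.
from itertools import product

sections = ("Author/Title catalog", "Subject catalog")
subsets = ("entire sample", "Santa Cruz records only")
correlations = range(1, 13)

tables = list(product(correlations, sections, subsets))
print(len(tables))
```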

In considering the reported error rates, we should keep in mind that 
152 entries of the sample of 5,900 (2.6%) contained some foreign language 
words which we could not analyze and which were not thoroughly inspected 
due to the unavailability of people who could read those languages. 

It should also again be mentioned that there are some discrepancies 
in the tabulated results of the study which may be due to keying errors 
in the punched cards which were not key verified or to the re-coding of 
some of the data. Almost all of these discrepancies appear in the Subject 
catalog data, where the totals of some of the tables vary between 1,194 
and 1,171. The few variations in totals in tables for the Author/Title 
catalog are no greater than 4. Most of the discrepancies are of little 
statistical significance. The totals in the tables presented here, there- 
fore, reflect those discrepancies; percentages given in the tables are 
percentages of the total given in that table. 

B. DISPLAY AND DISCUSSION OF FINDINGS FOR THE AUTHOR/TITLE CATALOG 
1. Summary of Error Rate 

The absolute numbers of the errors found in this study are noted in 
the tables. 



                             NUMBER OF ERRORS 

                          FATAL    SERIOUS      MINOR      TOTAL 

Author/Title Catalog        141      1,396      1,630      3,167 
                         (4.4%)    (44.1%)    (51.5%)   (100.0%) 

Subject Catalog             159        490        522      1,171 
                        (13.6%)    (41.8%)    (44.6%)   (100.0%) 

TOTAL                       300      1,886      2,152      4,338 
                         (6.9%)    (43.5%)    (49.6%)   (100.0%) 



TABLE 1: TOTAL NUMBER OF ERRORS FOUND IN THE SAMPLE 



The estimated catalog error rates can be computed from the above data 
and the sample size data given earlier. This results in the data shown in 
Table 2. 

                                ERROR RATE 

                          FATAL    SERIOUS      MINOR      TOTAL 

Author/Title Catalog 
  Errors per page           2.3       22.9       26.7       51.9 
  Errors per entry         0.04       0.39       0.45       0.88 

Subject Catalog 
  Errors per page           4.8       14.8       15.8       35.4 
  Errors per entry         0.07       0.21       0.23       0.51 

TOTAL 
  Errors per page           3.2       20.0       22.9       46.1 
  Errors per entry         0.05       0.32       0.36       0.74 

TABLE 2: SUMMARY OF COMPUTED ERROR RATE 
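
The arithmetic behind the TOTAL rows of Table 2 can be sketched as follows, using the overall sample size reported for the study (94 pages, 5,900 entries) and the total from Table 1. The per-section sample sizes are not restated in this section, so only the combined figures are computed; the function name is illustrative.

```python
# Sketch of the Table 2 arithmetic for the combined sample. The sample size
# (94 pages, 5,900 entries) is the figure reported for the whole study; only
# the overall totals are computed here.

SAMPLE_PAGES = 94
SAMPLE_ENTRIES = 5_900
TOTAL_ERRORS = 4_338  # from Table 1


def rate(errors: int, units: int, digits: int) -> float:
    """Errors per unit (page or entry), rounded for display."""
    return round(errors / units, digits)


print(rate(TOTAL_ERRORS, SAMPLE_PAGES, 1))    # errors per page
print(rate(TOTAL_ERRORS, SAMPLE_ENTRIES, 2))  # errors per entry
```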

2. Causes of Error in the Author/Title Catalog 

Figure 1 and Section III C. 4. in this report listed and discussed the 
causes of error that were considered for this study; all errors were attributed 
to one of these categories of causes. The gross distribution of total errors 
(fatal, serious, minor) by cause is given in Table 3. 

Table 3 shows the errors in the Author/Title catalog sample arranged
according to error type and cause. Each cell of the table gives the
number of errors found of a certain type with a certain cause. For example,
we see in the first horizontal row that no errors representing
"duplicate data not suppressed" were caused by keying errors or variant
cataloging practice, but 9 were found (not surprisingly) due to program
processing failures, and 12 were found due to unknown causes. In all there
were 21 errors found in this type category, 43% of which were due to program
processing and 57% of which were due to unknown causes.
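
The percentages quoted for a row of Table 3 are simply each cell divided
by the row total. A minimal Python sketch of that calculation follows,
using the "duplicate data not suppressed" row described above; only the
causes mentioned in the text are listed, and the function name is
illustrative.

```python
def row_percentages(row):
    """Convert a row of error counts (cause -> count) to percentages of the row total."""
    total = sum(row.values())
    if total == 0:
        return {cause: 0.0 for cause in row}
    return {cause: 100.0 * n / total for cause, n in row.items()}


# "Duplicate data not suppressed" row as described in the text above.
duplicate_data_row = {
    "keying": 0,
    "variant cataloging practice": 0,
    "program processing": 9,
    "unknown": 12,
}

shares = row_percentages(duplicate_data_row)
for cause, pct in shares.items():
    print(f"{cause:28s} {pct:5.1f}%")
```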

Some types of errors had consistent causes. For example, all 30 of
the transposition errors (Type 31) were due (again, not surprisingly) to
keying errors, and all 50 of the instances of non-English articles being
used improperly in filing were due to inadequate design. (No attempt was
made in the UCUCS programs to disregard non-English articles in filing.
Since filing errors resulting from non-English articles could have been
suppressed for most languages, these filing errors must be ascribed to
inadequate design rather than program processing.) We can also see from
this table that there were 215 instances of inappropriate entries found
in the sample (all due to processing other than programming).

Our study delineated six ways in which character strings could be
improperly used in filing: as function terms (e.g., editor, trans.),
dates (author or editor dates), associated titles (e.g., Mrs., 1st
baron), English articles, non-English articles, and "Other,"
encompassing errors not fitting neatly into any of the previous five
categories. In all, 311 instances were found of strings improperly
used in filing. Function terms and dates were the kinds of strings
most frequently misused in filing. But there were more than 311
instances of misfiled entries found in the Author/Title portion of the
sample. Most of the error type categories as listed in this table can
be associated with misfiling of entries, either by causing an entry to
appear in the wrong filing position or by causing an entry to fail to
consolidate with another entry and thereby also filing in the wrong
place.

We can see from this table that keying errors caused the largest
percentage of errors (45.5%) found in the Author/Title catalog sample. The
next highest percentage (20.3%) was contributed by program processing
failures.

It is of interest here to compare the analogous data given in Table 4
for the Santa Cruz records (errors arranged by type and cause).

Here we see that the highest percentage of errors in the Santa Cruz 
part of the sample was contributed by failures in program processing. It 
is impossible for us to say which programs were responsible for the errors. 
Errors could have occurred in the programs which generated the Santa Cruz 
tapes; they could have occurred in the programs which attempted to carry 
out any of the operations performed on the entire file (such as sorting, 
consolidation, generating added entries, etc.). It seems likely, however, 
that most of the program failure in the Santa Cruz records occurred either 
in the programs used to produce the tape which was sent to ILR or else in 
the programs which attempted to merge these records into the rest of the 
UCUCS file. This is a logical conclusion since 531 of the total 644 program
processing errors in the Author/Title catalog occurred in Santa Cruz records.
That is, 82.5% of all the program processing errors found in the
Author/Title sample occurred in Santa Cruz records.

Comparing Tables 3 and 4 we can also note that of the 252 errors
ascribed to "Unknown" causes, 160 or 63.5% occurred in Santa Cruz records. 
This reflects the subjective observation by the authors that rather bizarre 
and inexplicable errors occurred more frequently in these records. 

Finally, it should be noted that 1,211 of the 3,167 errors found 
in the Author/Title sample were found in records from Santa Cruz. This 
represents 38.2% of all the errors found in the Author/Title catalog 
sample. Analogous tables for the Subject catalog sample are presented in 
Section C. From these tables we can see that 30.1% of the errors found in 
the Subject catalog sample were found in Santa Cruz records. For the 
entire sample of 4,361 errors, 36.0% (1,570) occurred in Santa Cruz records.
Santa Cruz contributed a total of 122,240 titles (representing 16.3% of
the UCUCS titles, and 11.7% of the UCUCS records), and hence one would
expect the errors from UCUCS processing to be distributed over Santa Cruz 
records at about that same proportion. 
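
The comparison made in this paragraph can be sketched in Python: compute
the Santa Cruz share of each error group and set it against the 16.3%
share of UCUCS titles. The ratio printed at the end is an illustrative
measure of over-representation, not a statistic used in this study, and
all names are hypothetical.

```python
def share(part, whole):
    """Percentage of `whole` accounted for by `part`."""
    return 100.0 * part / whole


# Santa Cruz share of UCUCS titles, as stated in the text.
EXPECTED_SHARE = 16.3

# (errors in Santa Cruz records, errors in all records), from the text.
observed = {
    "all Author/Title errors": (1211, 3167),
    "program processing errors": (531, 644),
    "unknown-cause errors": (160, 252),
}

for label, (santa_cruz, total) in observed.items():
    pct = share(santa_cruz, total)
    ratio = pct / EXPECTED_SHARE
    print(f"{label:28s} {pct:5.1f}%  ({ratio:.1f}x the share of titles)")
```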

We can also see that a high proportion of those types of errors called
"orthographic inaccuracies" appeared in the Santa Cruz records. For example,
there were 769 instances of missing strings found in the entire Author/Title
sample; 513 (66.7%) of these appeared in Santa Cruz records. There were
597 instances of missing or added blanks in the entire Author/Title sample;
181 (30.3%) of these errors were contributed by Santa Cruz records. The
total of all types of orthographic inaccuracies for the Author/Title sample
was 2,331 (by far the most frequent general error type: 73.6% of all errors
found in the Author/Title catalog sample). Of these 2,331 errors, 43.5%
(1,014) were found in Santa Cruz records. As will be discussed below, most
orthographic inaccuracy errors were relatively minor, usually affecting 
only the appearance of the entry. However, some had more serious effects, 
such as non-consolidation, misfiling of entries onto other pages, and 
making the content of the record uncertain. 

3. Seriousness of the Errors in the Author/Title Catalog

Let us now consider the question of the severity of the errors
found in our sample. We divided the errors into three categories
according to the effect each error had. Of course, some errors had more
than one determinable effect, but since only one effect could be
recorded for each error, the most serious effect was chosen when there
was a choice.

Minor errors included only three effect categories: wasted space
(the error had no effect more serious than wasting space in the catalog);
appearance (the error affected only the appearance of the entry); and
other or unknown. The other or unknown category was included in the minor
errors because we believed that the categories for the more serious kinds
of errors had been carefully enough defined so that little, if anything,
had been left out. Table 5 displays the minor errors according to their
causes and effects. Here we see that there were 1,630 minor errors in the
Author/Title catalog sample, or 51.5% of all the errors in the Author/Title
segment of the sample. Most of these errors (902 or 55.3%) were caused by
keying errors, 18.7% were caused by program processing, and 13.4% were
caused by record processing other than keying or programming.

From Table 6, displaying the data for the minor errors in the Santa
Cruz records in the Author/Title segment, we can see that 665 or 40.8% of
the 1,630 minor errors found in the entire Author/Title segment were found
in Santa Cruz records. For the Santa Cruz portion of this segment,
program processing caused nearly as many minor errors as did keying
mistakes: 274 (41.2%) were caused by keying errors and 246 (37.0%) were
caused by program processing.

The serious errors included just two general types of error: those
which resulted in uncertainty of the content of the record, and those
which resulted in the misfile of the record in the same column or same
page (that is, the record was misfiled but still appeared on the same
page where it was supposed to appear). Both of these types of errors
can, in some cases, prevent a user from finding a needed item. But these
errors really represent differing degrees of seriousness. Misfiling of an
entry onto another column of the same page is more likely to result in the
user missing that entry point than is misfiling within the same column,
particularly if the needed item appears immediately adjacent to the entry
point where it should be. And since we interpreted the category "content
uncertain" so strictly that errors would be so categorized if they presented
any doubt about the meaning of the word or data element, many errors
tallied under this category might better have been included with the minor
errors.

Placing these errors in the serious category was an arbitrary choice
in keeping with the operating rule of "When in doubt, don't give the benefit
of the doubt to the catalog." The summary of all serious errors found in
the Author/Title sample is presented in Table 7. As noted in Table 1, serious
errors represent 44.1% of the errors in the Author/Title segment of the
sample. Most of the serious errors were caused by keying errors (509 of 
them, or 36.5%). Variant cataloging practice and program processing 
nearly tie for next most frequent cause (23.4% and 23.9%, respectively). 
We notice that almost half of all the serious errors recorded had the 
effect of content uncertain (656 of 1,396, or 47.0%). Given the arbitrary 
placement of this category into the serious error group, it is useful to 
consider what the results would be if the errors tallied under "content
uncertain" were omitted from the table. Without the content uncertain errors, 
there would be a total of 740 serious errors in the Author/Title segment, or 
23.4% of the 3,167 errors in this segment of the sample. Also, deleting 
these errors from the tabulated data changes the relative frequency of 
causes of serious errors. Of the 509 serious errors caused by keying 
errors, 312 would be deleted, leaving 197 serious errors caused by keying 
mistakes. This figure represents 26.6% of all these errors. The variant 
cataloging practice errors would have nearly the same total as in Table 7 
but a higher relative frequency: 319 instead of 327, 43.1% instead of 
23.4%. Program processing would then cause only 71 of the serious errors 
and would be the cause of serious errors only 10.0% of the time instead 
of the 23.9% given in the table. 

In summary, then, if those errors having the effect of content 
uncertain were considered minor instead of serious, the most prevalent 
cause of serious errors would be variant cataloging practice rather than 
keying errors, and program processing would have caused only 10% rather 
than nearly 24% of the serious errors. We will not list details of the
effects of adding these content uncertain errors to the minor error
tables; the interested reader can do that easily enough. It does seem
worth mentioning, however, that including the content uncertain errors 
with the other minor errors would increase the relative frequency of minor 
errors in the Author/Title segment of the sample from 51.5% to 72.2%. The 
serious and fatal errors together would then equal 27.9% rather than 48.6% 
of the Author/Title segment of the sample. 
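
The reclassification discussed above is a matter of moving the 656
"content uncertain" errors from the serious group to the minor group and
recomputing each share of the 3,167 Author/Title errors. A short Python
sketch of that arithmetic, with illustrative names:

```python
# Author/Title sample counts from Table 1 and the text above.
TOTAL = 3167
FATAL, SERIOUS, MINOR = 141, 1396, 1630
CONTENT_UNCERTAIN = 656  # serious errors with the effect "content uncertain"


def pct(n, total=TOTAL):
    """Percentage of the Author/Title sample errors, to one decimal place."""
    return round(100.0 * n / total, 1)


# Move the "content uncertain" errors from serious to minor.
serious_adjusted = SERIOUS - CONTENT_UNCERTAIN
minor_adjusted = MINOR + CONTENT_UNCERTAIN

print("serious:", SERIOUS, "->", serious_adjusted, f"({pct(serious_adjusted)}%)")
print("minor:  ", MINOR, "->", minor_adjusted, f"({pct(minor_adjusted)}%)")
```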

The Santa Cruz portion of the sample accounts for 522 or 37.4% of the
total 1,396 serious errors in the Author/Title catalog, as is shown in
Table 8. The Santa Cruz sample also accounts for over half of the 656
serious errors in the Author/Title catalog which result in uncertain
content, and for about 85% of the serious errors in the Author/Title
catalog which are caused by program processing. Over half (54.2%) of the
serious errors in the Santa Cruz sample were accounted for by program
processing, with keying errors (19.4%) and variant cataloging practice
(15.9%) being the next most frequent causes. We can see also that 330 or
63.2% of the serious errors in the Santa Cruz sample had the effect of
contents uncertain and of these 330 errors, 230 (69.7%) were caused by
program processing problems. Deleting the contents uncertain category in
this case would reduce the total number of serious errors in the
Author/Title portion of the Santa Cruz sample to 192. Program processing
errors would then account for 53 or only 28.1% of this total, and keying
errors for 39 or 20.3%. Variant cataloging practice would then be the
cause of the largest number (82, or 42.9%) of serious errors in the
Santa Cruz sample.

The fatal error category was, fortunately, less problematic than
the serious error category. The fatal errors consist of those errors
which definitely result in a lost entry point (effect 10, lost entry
point) and all those which are very likely to result in a lost entry
point (all those involving a misfile onto another page). Table 9
provides data on the cause and effect of fatal errors in the Author/Title
catalog. One of the most interesting features to be noticed in this
table is that, for the first time, inadequate design is responsible for
the plurality of errors. Here, inadequate design has contributed 58 of
the 141 fatal errors, or 41.1%. Variant cataloging practice is second,
with 23.4%, and keying errors run a close third (21.3%). Remembering
that keying errors contributed the overwhelming plurality of all errors
found in the Author/Title catalog (45.5% versus 20.3% for the second
most frequent cause), it is interesting to note here that keying is a less
significant factor in the fatal error causes than is inadequate design
and that it is approximately equal in frequency with variant
cataloging practice.
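
The rule that only the most serious effect is recorded for each error can
be expressed compactly. The Python sketch below maps the effect
categories described in this section to the three levels of seriousness
and picks the most serious one; the short labels and numeric ranks are
illustrative and are not the study's own codes.

```python
# Severity ranks; a higher number means a more serious effect.
MINOR, SERIOUS, FATAL = 1, 2, 3

# Effect categories as described in this section, mapped to severity.
EFFECT_SEVERITY = {
    "wasted space": MINOR,
    "appearance": MINOR,
    "other or unknown": MINOR,
    "content uncertain": SERIOUS,
    "misfile, same page": SERIOUS,
    "misfile, another page": FATAL,
    "lost entry point": FATAL,
}


def recorded_effect(effects):
    """Pick the single effect to record: the most serious of those observed."""
    return max(effects, key=lambda e: EFFECT_SEVERITY[e])


chosen = recorded_effect(["appearance", "misfile, same page"])
print(chosen)
```

When two observed effects have the same rank, this sketch keeps the first
one listed; the study does not state how such ties were resolved.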

It is also worthwhile to point out in Table 10 that in the case of 
fatal errors Santa Cruz records do not contribute a significantly high 
percentage of the errors. Of the 141 fatal errors in the Author/Title 
sample, 24 (or 17.0%) were in Santa Cruz records. The numbers in this 
table are so small that little else can be concluded from them. However, 
it does seem worth noticing that close to half (41.7%) of these errors 
were lost entry points due to causes other than the five specifically 
defined cause categories. 

4. Location of Appearance of the Errors in the Author/Title Catalog



We have so far discussed the error aspects of cause, type, and effect.
Another question which might be asked, particularly by those people
considering file improvement, is, "Where in the records are the errors
located?" This is an important question for people interested in correcting
errors in the file, since different error correction devices can be used
on various parts of the records. For example, an authority file for
author names can be used for finding and correcting errors in author names,
an English language dictionary authority list can be used on English
language titles, a subject heading authority list can be used on subject
headings, and so on. It was hoped that this study would assist those
involved in file improvement to decide which error correction devices might
best be employed.

As mentioned in the methodology section of this report, there was 
sometimes a problem when errors were found in added entries in the 
Author/Title catalog or in entries in the Subject catalog: we could see 
that an error had occurred (for example, failure of entries to consolidate
correctly), but the cause and source location of the error could not be
determined without looking at the full bibliographic record in the
Author/Title catalog. We wanted to tally errors according to where they
appeared for various reasons, but primarily in order to estimate accurately 
the amount of space wasted in the catalog due to inappropriate entries 
and non-consolidation of entries. But people interested in file improvement 
would probably be more concerned with the source location of the error 
(that is, where in the full bibliographic record the error appears) and 
not so interested in where the error appears in the subject or added entry. 
Therefore, we recorded the data so that both kinds of error location could 
be tallied in data reduction. 
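
One way to picture this double recording is a record that carries both
locations, tallied once by each. The Python sketch below is only an
illustration of the idea: the field names and the three sample errors are
hypothetical and do not come from the study's data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ErrorRecord:
    """One observed error; field names are hypothetical."""
    appearance_location: str  # where the error shows up in the printed entry
    source_location: str      # where it originates in the full record


# Hypothetical examples: the second error is visible in an added entry
# but originates in the title statement of the main entry.
errors = [
    ErrorRecord("main entry: collation", "main entry: collation"),
    ErrorRecord("added entry: heading", "main entry: title statement"),
    ErrorRecord("added entry: heading", "added entry: heading"),
]

by_appearance = Counter(e.appearance_location for e in errors)
by_source = Counter(e.source_location for e in errors)

print(by_appearance)
print(by_source)
```

Both tallies sum to the same number of errors; only the distribution over
locations changes.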

Let us consider, in Table 11, the Author/Title sample according to 
cause of error and location of error appearance. First we note that of 
the total errors found in the Author/Title catalog sample, over twice as 
many were found in main entries (2,161) as were found in added entries
(1,002). This makes sense because there are more data elements appearing
in the main entries, and therefore more opportunities for errors to appear. 

We can see from this table that more errors (532) occurred in the
collation statements of main entries than in any other portion of the
entries in the Author/Title segment of the sample. (The influence of
the Santa Cruz records on this figure will be discussed below.) The next
most frequent location of error appearance was in added entry headings (421).
Most added entry headings are generated from the title statement of main
entries (including short title, author statement, editor statement, etc.).
It is therefore not surprising that the next most frequent location for
errors to appear was in those elements which make up the title statement
of the main entries. The total number of errors found in the title
statement of main entries was 368.

The reader may notice that a total of 147 errors were found in the 
call numbers and/or location codes of the entries in this part of the 
sample. This might seem alarming if we assume that errors in call numbers 
or location codes would necessarily lead the catalog user astray. Fortu- 
nately, this is not the case. Most errors found in these parts of the 
entries were very minor — such as missing or added blanks — and generally 
affected only the appearance of the record. 

It is worthwhile to discuss briefly the corresponding data in Table
12 for the Santa Cruz subset of this segment of the sample. One of the 
most interesting points illustrated by comparing this table with the 
preceding one is that 311 of the 532 errors appearing in collation 
statements were in Santa Cruz records. This represents 58.5% of all the 
collation statement errors in the Author/Title sample. Such errors may be 
a significant problem, because many errors in collation statements were 
responsible for nonconsolidation (and therefore for misfiling) of entries. 

Of the errors found in the whole Author/Title sample in publisher 
statement and tracings segments of main entries, similarly high percentages 
were contributed by Santa Cruz records. In publisher statements, 175 of 
the 228, or 76.8%, of the errors found were in Santa Cruz records. For 
errors found in tracings, the figures are 168 out of 266, or 63.2%. 
Although there were relatively few (91) errors found in publication dates, 
79.1% (72) of these occurred in Santa Cruz records. We might also note 
that 77 of the 111 (or 69.4%) of the errors found in added entry elements 
other than those specifically defined came from Santa Cruz records. This 
figure reflects the relatively high number of instances of unnecessarily 
(and incorrectly) added data fields found in Santa Cruz records. 

Table 13 correlates errors in the Author/Title Catalog by type and 
location of error appearance. We see that the most common error type was 
that of missing string (769, or 24.3% of all errors in the Author/Title 
segment). The next most frequent error type was that of missing or added 
blanks (597, or 18.9%). Third and fourth most frequent error types were 
other or undetermined misspelling and incorrect or missing capitalization, 
respectively. The four most common error types were all instances of 
orthographic inaccuracy, which might be expected since we have already 
learned that keying errors were the most common cause of error (45.5%)
and were generally the cause of orthographic inaccuracies. All types
of orthographic inaccuracy combined contributed 2,326, or 54.9% of all
the errors found in the Author/Title segment.

5. Location of Origin of Errors in the Author/Title Catalog

Now let us consider, in Table 14, the errors according to where they 
originated in the source records. The figures in Table 14 differ
from Table 11 because 75 of the errors tallied in the general category 
for added entries in Table 11 have been subtracted from that category 
and tallied in various categories for main entry locations. This was done 
because the cause of error in 75 of the added entries was not determinable 
without examining the corresponding main entries. This means that the
location of these errors was not apparent from the added entries
themselves. By comparing this table with Table 11 we see that no dramatic 
differences appear; these errors were rather insignificantly dispersed 
through various source locations. One interesting comparison we can note, 
however, is that in considering location of error appearance, 40 of the 
361 (or 10.1%) errors caused by variant cataloging practice occurred in 
the author part of title statements in main entries. When source location 
is considered instead, 55 (or 15.2%) of these 362 errors occurred in the 
author part of main entry title statements. 

The source location data for the Santa Cruz sample is presented in
Table 15. As is true for the subject sample as a whole, few significant
trends can be found in the data. It may be noted, however, that 29 (32.9%)
of the 88 errors due to variant cataloging practice originated in the 
author part of the title statement, although only 22 (25%) appeared there. 

Analogous to the immediately preceding tables, tables were also 
generated correlating type with source location of errors. No dramatic 
differences were found in the two ways of tabulating error location. 

6. Other Correlations of Errors in the Author/Title Catalog 

So far we have correlated error cause with the other three main error 
aspects (type, location, and effect); we have correlated error type with 
two other aspects (cause and location); and we have correlated error 
location with two other aspects (cause and type). Error effect has only 
been correlated with cause in the tables describing minor, serious and 
fatal errors. The effect aspect has not been correlated with error 
location or error type, but the source data is available to permit this 
to be done at a later date if desired.

7. Effect of Non-Consolidation of Entries in the Author/Title Catalog 

Non-consolidation occurred in various forms in the catalog. It
could occur in entry headings only, in the bodies of entries only, or in
both; these three categories of non-consolidation could occur in main
entries, added entries, and subject entries. Subject entries could also
have non-consolidated subject headings, with or without any of the other
kinds of non-consolidation. There are a total of 17 possible combinations
of these factors.

The various types of non-consolidation involved differing degrees of
space wasted in the catalog. For example, the non-consolidation of two main
entries might result in two inches of column space being wasted, whereas the
non-consolidation of two added entries might waste only [...] inch of space,
and the non-consolidation of an entry heading only might waste even less
space in the catalog. Consequently, in order to arrive at an accurate
estimate of the amount of space wasted in the catalog, it would be necessary
to consider all 17 of the types of non-consolidation and estimate the amount
of space wasted by each one.

The effect category, wasted space, was used for all instances of
inappropriate entries and also for instances of added data elements which
used up at least one extra line of type. Inappropriate entries could occur
as main, added, and subject entries. The type of entry would affect the
amount of space wasted. Added data elements could appear in either added or
subject entries. It is assumed that the type of entry in which an added
data element appears would not substantially affect the amount of space
wasted.
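
The kind of estimate described above amounts to multiplying the number of
instances in each category by the space each instance wastes and summing
the results. The Python sketch below shows the shape of that calculation
only: the three categories, the counts, and the inches per instance are
hypothetical values chosen for illustration, since the study did not
measure them.

```python
# Hypothetical figures for illustration only:
# category -> (instances, inches of column space wasted per instance).
# The study did not measure these values.
wasted_space_model = {
    "main entry not consolidated": (10, 2.0),
    "added entry not consolidated": (20, 0.5),
    "entry heading only not consolidated": (15, 0.25),
}


def total_wasted_inches(model):
    """Sum instances x inches per instance over all categories."""
    return sum(count * inches for count, inches in model.values())


print(total_wasted_inches(wasted_space_model))
```

A full estimate would need one such row for each of the 17 types of
non-consolidation, plus rows for inappropriate entries and added data
elements.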

Detailed study of wasted space in UCUCS would involve a complex analysis
of all these elements, an analysis prevented by the time limitations on this
project. A very rough estimate of space wasted in UCUCS due to
non-consolidation of entries was made in an unpublished student paper by
Judy Todd and others for a systems analysis class in the School of
Librarianship, U.C. Berkeley. For that paper a sample of 15 pairs of pages
was xeroxed from Volume I (A-Ana) of the Author/Title catalog of UCUCS.
These 30 pages were searched for main and added entry non-consolidation of
entry heading only and of the entire entry. Space wasted due to other added
data elements was not considered in this report.

Of the sample of 1,895 entries, 16.1% were duplicate entries, that is,
redundant entries which should have consolidated with other entry headings
or whole entries. These duplicates were analyzed according to the following
categories: type of entry (added entry or main entry), area of duplication
[...] discrepancies, typographical errors, lack of Automatic Format
Recognition program, undetermined). Results of this particular study
indicated that an average of about 3 column-inches per page, or about 10.4%
of the printed column-space, was taken up by duplicate entries. As the
sample size for this study was small and as added data elements were not
considered in [...] rough and possibly conservative estimate of wasted
space in the UCUCS catalog because of failure to consolidate entries.

C. DISPLAY AND DISCUSSION OF FINDINGS FOR THE SUBJECT CATALOG



1. Summary of Error Rate 

The summary data for the number of errors found in the Subject catalog 
during this study are repeated here from Table 1. 



                     NUMBER OF ERRORS

                     FATAL          SERIOUS        MINOR          TOTAL

Subject Catalog      159 (13.6%)    490 (41.8%)    522 (44.6%)    1,171 (100.0%)



The estimated Subject Catalog error rates are repeated here from 
Table 2. 



                     ERROR RATE

                     FATAL     SERIOUS    MINOR     TOTAL

Errors per page      4.8       14.8       15.8      35.4
Errors per entry     0.07      0.21       0.23      0.51



It may be repeated here that there were discrepancies in totals of 
some of the tables, particularly for the Subject section of the sample. 
For instance, tables correlating cause and effect for minor, serious, 
and fatal errors total 1,171, but the table correlating cause and type 
for the Subject sample totals 1,193. The discrepancies have no great 
statistical significance. Percentages within tables presented here are 
percentages of the total given in that particular table. 

2. Causes of Errors in the Subject Catalog 

Figure 1 and Section III. C 4. listed and discussed the causes that 
were to be considered for this study, and all errors were attributed to one 
of these categories of causes. The gross distribution of total errors 
(fatal, serious, and minor) by cause is given in Table 16. 

We can see from this table that keying errors caused the largest
percentage of errors (39.0%) found in the Subject catalog sample. The next
highest percentage (22.4%) was contributed by program processing failure.
These findings are consistent with those for the Author/Title catalog, 
where keying errors also contributed the highest percentage (45.5%) and 
program processing failure the next highest percentage (20.3%) of errors. 

The most common error type in the Subject catalog was found to be 
orthographic inaccuracy, which totalled 718, or 60.2%, of the 1,193 errors 
in the Subject sample. (Orthographic inaccuracy was also the most common 
error type in the Author/Title catalog, where 73.6% of the errors were of 
this error type.) As can be seen in Table 17 (errors in Santa Cruz
records by type and cause), keying errors were most commonly responsible
for orthographic inaccuracies, accounting for 64.5% of this error type
in the Subject sample. As was mentioned in the discussion of such errors
in the Author/Title catalog (Section IV. B. 2.), orthographic inaccuracies
usually have only minor effects, but occasionally may be responsible for
more serious errors, such as non-consolidation of entries.

The Santa Cruz sample had characteristics somewhat different from
the sample as a whole. Here, for instance, most errors (46.8%) were due
to program processing failure rather than keying mistakes; keying errors
do account for the next highest percentage (26.5%) of errors, however.
As for the Subject catalog as a whole, the most common error type was
orthographic inaccuracy, which accounts for 162, or 45.1%, of the 359
errors in the Santa Cruz Subject catalog sample, and over half (56.8%) of
these orthographic inaccuracies are the result of keying errors. However,
the Santa Cruz subset, unlike the Subject catalog as a whole, has a
significant percentage (39.8%) of errors categorized as "data field added."
Further, it is this error type which is most closely associated with program
processing failure, which accounts for 138, or 96.5%, of the 143 "data field
added" errors. (This error type is not as serious as some other types,
but may result in wasted space. Programming limitations which may have
caused this error type were discussed previously in the section on
methodology, Section III. C. 1.)

A comparison of Tables 16 and 17 shows that the Santa Cruz records 
account for 30.1% of all the errors found in the Subject catalog sample. 
In particular, it may be noted that 62.9% of the errors caused by program 
processing failure occurred in the Santa Cruz records. (This percentage is 
somewhat lower than the percentage of program processing errors in the 
Author/Title catalog attributable to Santa Cruz records; a discussion of 
where the program failure could have occurred may be found in Section IV.
B. 2., which deals with causes of error in the Author/Title catalog.) It
may also be noted that Santa Cruz records account for 64, or 49.2%, of the
130 errors in the Subject catalog sample which are due to variant cataloging
practice. However, although in the Author/Title catalog the majority of
errors ascribed to "unknown" causes occurred in Santa Cruz records, in the
Subject catalog the Santa Cruz sample contributed only 19 of the 173 errors
with "unknown" causes.

3. Seriousness of Errors in the Subject Catalog 

As in the Author/Title catalog, errors in the Subject Catalog were 
categorized as minor, serious and fatal, with the most serious effect 
being chosen when an error had more than one effect. 

Minor errors, as mentioned in Section IV. B. 3., have three effect
categories: wasted space, appearance of the entry, and other or unknown.
Table 18 displays these errors according to cause and effect. There were
522 minor errors in the Subject catalog sample, or 44.6% of all errors in
the Subject sample. (In the Author/Title segment minor errors accounted
for 51.5% of all errors.) Unlike the Author/Title catalog, where keying
mistakes far outnumbered program processing errors (55.3% and 18.7%
respectively), keying mistakes accounted for only a slightly higher
percentage (34.7%) of the minor errors than did program processing failure
(34.3%) in the Subject sample. Each of these causes was primarily related
to a different effect. For instance, 177, or 97.8%, out of 181 keying errors
affected the appearance of the entry only; further, these keying errors
accounted for 90.3% of all those minor errors affecting appearance. On
the other hand, 141, or 78.8%, of program processing problems resulted in
wasted space, and these program processing errors accounted for over half
(52.6%) of the total number of minor errors causing wasted space.

The Santa Cruz records account for 37.2% of all the minor errors 
in the Subject catalog, a figure roughly proportionate to the percentage 
of minor errors in the Author/Title catalog contributed by the Santa Cruz 
sample (40.8%). From Table 19, however, we can see that program processing 
problems caused 143, or 73.7%, of the 194 minor errors in the Santa Cruz
records (keying errors account for only 21.2% of minor errors in these 
records), thus accounting for 79.9% of all the minor errors caused by 
programming problems in the Subject catalog. 

For the Santa Cruz sample, as for the whole Subject catalog sample, 
the most common result of program processing errors was wasted space, and 
conversely, 114, or 91.2%, of the 125 minor errors causing wasted space 
in the Santa Cruz sample of the Subject catalog were the result of 
program processing failure. 
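
The percentages quoted above for minor errors can be recomputed from the counts given in the text. The following is only an illustrative sketch: the helper `pct` and the constant names are introduced here for convenience and do not come from the study, and only counts stated explicitly in the preceding paragraphs are used.

```python
def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts quoted in the text for minor errors in the Subject catalog sample.
MINOR_TOTAL = 522         # all minor errors in the Subject sample
KEYING_MINOR = 181        # minor errors caused by keying mistakes
KEYING_APPEARANCE = 177   # keying errors affecting appearance only
SC_MINOR = 194            # minor errors in the Santa Cruz records
SC_PROGRAM = 143          # Santa Cruz minor errors due to program processing
SC_WASTED = 125           # Santa Cruz minor errors causing wasted space
SC_WASTED_PROGRAM = 114   # of those, due to program processing failure

shares = {
    "keying share of minor errors": pct(KEYING_MINOR, MINOR_TOTAL),
    "keying errors affecting appearance": pct(KEYING_APPEARANCE, KEYING_MINOR),
    "Santa Cruz share of minor errors": pct(SC_MINOR, MINOR_TOTAL),
    "program processing share, Santa Cruz": pct(SC_PROGRAM, SC_MINOR),
    "wasted space due to programs, Santa Cruz": pct(SC_WASTED_PROGRAM, SC_WASTED),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

Each computed share agrees with the figure reported in the text (34.7%, 97.8%, 37.2%, 73.7%, and 91.2%, respectively).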

Serious errors represent 41.8% of all errors in the Subject
catalog (see Table 1). As may be seen in Table 20, keying errors account
for over half (55.3%) of these errors, with variant cataloging practice
and program processing the next most frequent causes (20.0% and 16.5%, 
respectively). It is interesting to note on Table 21 that for the Santa 
Cruz section of the subject sample, variant cataloging practice accounted 
for slightly more serious errors than did keying errors (38.8% and 38.1%, 
respectively). Further, although in the entire Subject catalog sample 
(as in the Author/Title catalog) a large proportion (43.1%) of the
serious errors resulted in the effect called "contents uncertain," in the
Santa Cruz section of the subject sample the largest proportion of the
errors (41.0%) resulted in non-consolidation of entry in the same column 
without heading. The "contents uncertain" category accounted for only 20.2% 
of the serious errors in the Santa Cruz subject sample. 

It was pointed out in the discussion of serious errors in Section IV.
B. 3. that describing errors in the "contents uncertain" category as "serious"
was somewhat arbitrary and that it might therefore be useful to review the
results if the errors under "contents uncertain" were considered minor
instead of serious. Deleting the "contents uncertain" category would
reduce the total number of serious errors in the subject sample from 490 
to 279. The number and relative proportion of errors due to keying 
mistakes would be reduced to 124 or 44.4% of the new total. The number 
of errors due to program processing mistakes would be reduced to 30 
(10.8%). On the other hand, although the number of errors due to variant 
cataloging practice would stay the same, the proportion of these errors 
relative to the total would be increased to 35.1%. Similarly, in the 
Santa Cruz portion of this sample, keying errors would be reduced to 
29 (33.7%) out of a new total of 86 serious errors. The number of errors
due to variant cataloging and program processing would change little, 
but the proportion of variant cataloging practice errors relative to the 
new total for the Santa Cruz subject sample would increase to 62.8%.
Program processing would then account for 23.3% of the 86 serious errors. 
Further, if the "contents uncertain" category were deleted, the effect
category "non-consolidated entry in same column without heading" would 
represent by far the largest proportion of serious errors in the subject 
sample, accounting for 66.3% of the Santa Cruz section and for 52.3% of 
the whole Subject catalog sample. 
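
The recalculation described above can be checked with a short computation. This is a sketch only, under one stated assumption: the count of serious errors due to variant cataloging practice is not given directly in the text, so it is inferred here as 20.0% of the original 490 serious errors (about 98). The helper `pct` and all constant names are introduced for illustration.

```python
def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Figures stated in the text for serious errors in the Subject sample.
SERIOUS_TOTAL = 490       # serious errors, "contents uncertain" included
SERIOUS_REDUCED = 279     # serious errors after deleting that category
KEYING_REDUCED = 124      # keying errors remaining after the deletion
PROGRAM_REDUCED = 30      # program processing errors remaining

# ASSUMPTION: inferred from the reported 20.0% share of the original total;
# the text says only that this count "would stay the same".
VARIANT = round(0.200 * SERIOUS_TOTAL)

# Errors removed with the "contents uncertain" category.
uncertain = SERIOUS_TOTAL - SERIOUS_REDUCED

print("contents uncertain share before:", pct(uncertain, SERIOUS_TOTAL))
print("keying share after:", pct(KEYING_REDUCED, SERIOUS_REDUCED))
print("program processing share after:", pct(PROGRAM_REDUCED, SERIOUS_REDUCED))
print("variant cataloging share after:", pct(VARIANT, SERIOUS_REDUCED))

# Santa Cruz portion: 29 keying errors out of a new total of 86.
print("Santa Cruz keying share after:", pct(29, 86))
```

Under that assumption the results match the reported figures: the deleted category is 211 errors (43.1% of 490), and the new shares are 44.4% for keying, 10.8% for program processing, and 35.1% for variant cataloging practice. The variant cataloging share rises only because the denominator shrinks while its count is unchanged.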

As can be seen in Table 1, the proportion of serious errors to the 
total number of errors in the Subject catalog is comparable to the pro- 
portion of these errors to the total in the Author/Title catalog. There 
are proportionately fewer minor errors in the Subject catalog (44.6% of 
the subject sample) than in the Author/Title catalog (51.5%). (As mentioned 
earlier, far more of these errors are caused by program processing problems 
in the Subject catalog than are so caused in the Author/Title catalog.) 
It is interesting to note in Table 22, then, that fatal errors are signifi- 
cantly more numerous in the Subject catalog than in the Author/Title 
catalog; fatal errors account for 13.6% of the total errors in the subject 
sample as opposed to accounting for 4.4% of the total errors in the 
Author/Title catalog. Further, as can be seen in Table 2, the rate of
fatal errors per page in the Subject catalog (4.8) is over twice that of
the Author/Title catalog (2.3). Unfortunately, little can be noted about
the factors determining these differences. By far the largest proportion 
of fatal errors (72.3%) are ascribed to "unknown" causes, a category used 
whenever there was great uncertainty as to the specific cause of the 
error. Variant cataloging practice is responsible for the next largest 
proportion (13.2%). One may note that 114 out of the 115 fatal errors 
due to "unknown" causes resulted in misfiling off the page. (Such misfiles 
account for 80.5% of the total fatal errors in the Subject catalog.) In 
the Author/Title catalog, by contrast, inadequate design was responsible 
for the plurality of fatal errors (41.1%), with variant cataloging practice 
(23.4%) and keying mistakes (21.3%) the next most significant causes. For 
the Subject catalog, inadequate design accounted for only 2.5% and keying 
errors for only 8.8% of all 159 fatal errors.

As was true of fatal errors in the Author/Title catalog, only a small 
percentage (9.4%) of the fatal errors in the Subject catalog occurred in
the Santa Cruz records. The numbers of errors displayed in Table 23 are
too small to be of great statistical significance, but one may note that
8 of the 15 fatal errors in the Santa Cruz sample were due to variant
cataloging practice, and 4 of the 15 were ascribed to unknown causes.

4. Location of Appearance of the Errors in the Subject Catalog

Table 24 displays data correlating the cause of errors in the Subject 
catalog with the location of the appearance of the errors. Errors in the 
Subject catalog sample occurred most frequently in the subject heading; 
31.7% of the errors in the sample were found in subject headings. Over 
a third (137, or 36.2%) of the errors in the subject headings were ascribed 
to "unknown" or uncertain causes; slightly fewer (110, or 29.1%) were due 
to keying mistakes. The next most frequent location of error appearance
was title statement (255, or 21.4% of the total); most of these errors
were caused by keying mistakes.

In the Santa Cruz sample, however, errors occurred most frequently
in entry elements other than those specifically defined. Table 25 shows
that almost 40% of the errors in the Santa Cruz sample occurred in such 
entry elements, and that program processing was responsible for almost 
all of these errors. Comparing Tables 24 and 25, one can see that Santa 
Cruz records were responsible for 143 (65.3%) of the 219 errors found in 
subject entry elements other than those specifically defined and for 137 
(82.0%) of the 167 such errors which were ascribed to program processing 
failure. As discussed in Section IV. B. 4., the Santa Cruz data for
errors in added entries in the Author/Title catalog showed similar trends, 
and, as noted there, these figures reflect a relatively high proportion of 
added data fields found in Santa Cruz records (see also Table 27). 

The data in Tables 26 and 27, which correlate location of appearance 
of errors with error type, reflect the trends noted in Tables 24 and 25.
For example, it may be seen from Table 26 that orthographic inaccuracies 
account for most of the errors located in the subject heading and title 
statement (and for 60.2% of the total number of errors in the Subject 
sample) , as might be expected from the information in Table 24 that a 
substantial number of such errors were caused by keying mistakes.

It may also be noted that outside the orthographic mistakes, the
single largest error type in the Subject sample was "data field added" 
(18.2%), and that most of these errors occurred in subject entry elements 
other than those specifically defined. Further, a comparison of Tables 26 
and 27 shows that the Santa Cruz records were responsible for 143, or 
and, again, that most of this type of error in the Santa Cruz records 
were located in entry elements other than the specifically defined cate- 
gories. It was pointed out in the discussion of error type and cause 
(Section IV. C. 2.) that "data field added" was the largest single error
type in the Santa Cruz subject sample, accounting for almost 40% of the 
errors in that subset, and is closely associated with program processing 
failure in that subset. It can be seen from Table 27 that the primary
result of this error type in the Subject catalog, and of the program
processing problems which caused it, is wasted space in the addition of
unnecessary entry elements.

5. Location of Origin of Errors in the Subject Catalog 

Tables 28 and 29 correlate cause of errors in the Subject catalog with 
the location of the source of the errors. As explained earlier in the 
discussion of the corresponding tables for the Author/Title catalog (Tables 
14 and 15), these tables differ from Tables 24 and 25 in that, whenever 
the location of the cause of an error differed from the location of the
appearance of that error in the Subject entry, it was subtracted from the 
general subject entry category (used for tallying such errors) and tallied 
in the main entry location where the error originated.

There were 92 errors in the Subject catalog sample for which the 
source location differed from the location of appearance; 54 (58.7%) of 
these were from Santa Cruz records. As was true for the Author/Title
catalog, these errors were dispersed in various source locations rather 
than in a significant few. It may be noted, however, that of the 41 
errors caused by variant cataloging practice whose source location 
differed from the location of appearance, 19 (46.3%) originated in the 
author part of title statements in main entries. In the Santa Cruz
records, 11 (45.8%) of 24 such errors were traced to the author part
of the title statement. A similar trend was noted earlier for added
entries in the Author/Title catalog.

6. Other Correlations of Errors in the Subject Catalog

As for errors in the Author/Title catalog sample, aspects of errors
in the Subject catalog have now been correlated and discussed in a number 
of ways. Error cause has been correlated with three other aspects: type, 
location (both location of origin and of appearance) , and effect. Error 
type has been correlated with cause and location of appearance. Error 
effect has been correlated with cause in tables describing minor, serious, 
and fatal errors, but not with either location or type. However, the 
source data is available to permit this to be done at a later date if 
desired. 

7. Effect of Non-Consolidation of Entries in the Subject Catalog 

As mentioned in the analysis of the Author/Title catalog, time limi- 
tations prevented study of the wasted space in UCUCS. The rough estimate 
of 3 column-inches per page given in that section was based on a sample
from the Author/Title catalog. More detailed breakdown and analysis of the 
non-consolidation and wasted space categories of the Subject catalog would 
be necessary before a reliable estimate could be made of the space wasted 
in the Subject catalog. Such analysis cannot be done here, again due to 
time limitations, but it is to be hoped that some future studies may deal 
with this problem. 